PREFACE


This volume is part of a five-volume set that summarizes the research of participants in the 1996 AFOSR
Summer Research Extension Program (SREP). The current volume, Volume 1 of 5, presents the final
reports of SREP participants at Armstrong Laboratory. Volume 1 also includes the Management Report.

Reports presented in this volume are arranged alphabetically by author and are numbered consecutively,
e.g., 1-1, 1-2, 1-3; 2-1, 2-2, 2-3, with each series of reports preceded by a 35-page management summary.
Reports in the five-volume set are organized as follows:
Modeling and Design of New Cold Cathode Emitters
Using Wide Bandgap Semiconductors


M. Cahay

Associate Professor

Department of Electrical Engineering
University of Cincinnati


ABSTRACT 


We analyze the importance of current crowding in a new cold cathode emitter which consists
of a thin wide bandgap semiconductor material sandwiched between a metallic or heavily
doped semiconductor and a low work function semimetallic thin film. Potential material
candidates are suggested to achieve low-voltage (< 10 V), room-temperature cold cathode
operation with emission currents of several tens of A/cm$^2$. We calculate the lateral potential
drop which occurs across the emission window of cold cathodes with circular geometry and
describe its effects on the emitted current density profile. The power dissipation in the
cold cathode is calculated as a function of a dimensionless parameter characterizing the
importance of current crowding. We determine the range of dc bias over which cold cathodes
of different radii must be operated to minimize current crowding and self-heating effects.
I. INTRODUCTION


Recently, there has been renewed interest in cold cathode emitters for applications to
a variety of electronic devices, including microwave vacuum transistors and tubes, pressure
sensors, thin panel displays, and high temperature and radiation tolerant sensors, among others
[1, 2]. Introduction of such emitters would permit an unprecedented compactness and weight
reduction in device and equipment design. Low temperature operation in nonthermionic
electron emitters is very desirable for keeping the statistical energy distribution of emitted
electrons as narrow as possible, to minimize thermal drift of solid state device characteristics,
and to avoid accelerated thermal aging or destruction by internal mechanical stress and
fatigue. Keeping the emitter temperature rise small appears easy if the emitters are built as
thin epitaxial films using vertical layering technology, owing to the extremely short heat paths
and excellent heat sinking possibilities offered by this architecture. For an electron emitter
to be useful in microwave tube applications it should be capable of delivering current densities
in excess of 10 A/cm$^2$ and of sustaining emission during operational lifetimes over periods of
10® hrs. To satisfy this requirement, the structural and chemical composition must be stable.
This rules out the historically practiced use of alkali metal films on emitter surfaces for the
lowering of electronic work functions. These films sublimate, evaporate or surface migrate
over time and end up on various surfaces inside the vacuum envelope.

Several cold cathode emitters have been proposed since their first successful demonstration
by Williams and Simon [3] using a cesiated p-type GaP structure. A review and
criticism of the different cold cathode approaches was given recently by Akinwande et al.
[4]. In this work, we propose a new cold cathode emitter concept and use a simple model to
show that the new emitter is capable of achieving low voltage (< 10 V) room temperature
operation with emission currents approaching 100 A/cm$^2$ and large efficiencies. A preliminary
report of this work has been published earlier [5]. The architecture of the structure is
shown in Fig. 1. The main elements in the design and functioning of such an emitter are:
(1) a wide bandgap semiconductor slab (an undoped CdS region) equipped on one side with a
metallic contact [6] or a heavily doped semiconductor (InP) that
supplies electrons at a sufficient rate into the conduction band and (2) on the opposite side,
a thin semimetallic film that facilitates the coherent transport (tunneling) of electrons from
the semiconductor conduction band into vacuum. Of importance is the mutual alignment
of the crystalline energy levels at the semiconductor-semimetal film junction. This requires
the use of new materials and development of their epitaxial growth technologies. For that
reason, the choice of InP as a substrate is particularly attractive since the lattice constant
of InP (5.86 Å) closely matches the lattice constant of the zincblende cubic CdS (5.83 Å).
Furthermore, there have been recent reports on the deposition of crystalline layers of CdS on
InP by molecular beam epitaxy [7], chemical bath deposition [8], and pulsed laser deposition
[9]. The proposed cold cathode should therefore be realizable with present day technology.
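The lattice-matching argument above can be made quantitative with a one-line calculation. The following is a minimal sketch, assuming the quoted lattice constants are read as 5.86 Å for InP and 5.83 Å for cubic CdS; the function name `lattice_mismatch` is illustrative and does not come from the original work.

```python
def lattice_mismatch(a_film: float, a_substrate: float) -> float:
    """Relative lattice mismatch (a_film - a_substrate) / a_substrate."""
    return (a_film - a_substrate) / a_substrate


# Lattice constants in angstroms, as quoted in the text (assumed readings).
A_INP = 5.86   # InP substrate
A_CDS = 5.83   # zincblende (cubic) CdS film

mismatch = lattice_mismatch(A_CDS, A_INP)
print(f"CdS on InP mismatch: {100 * mismatch:.2f} %")
```

With these inputs the mismatch is about half a percent in magnitude, which is the sense in which the two lattices "closely match".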

As shown in Fig. 1(a), a thick metal grid is defined on the surface of the LaS thin
film to bias the structure. There are openings in the grid structure to expose the thin LaS
film which forms the active emission area of the cold cathode. Cathodes with rectangular
(Fig. 1(b)) emission windows were studied previously [5]. Current crowding and self-heating
effects in cathodes with circular (Fig. 1(c)) emission windows will be considered
hereafter. The bias is applied between the back metallic contact and the metal grid, with
emission occurring from the exposed LaS surface. If the applied voltage is equal to or larger
than the semiconductor bandgap energy and the quotient of the applied voltage divided by
the semiconductor thickness approaches 0.1 eV/Å, then electrons are tunnel injected into the
conduction band and ascend, during their travel across the semiconductor film, to levels of
increasing energy. Referring to Fig. 2, the conduction band of the wide bandgap semiconductor
provides the launching site from which electrons are injected, through a thin film,
into vacuum. This injection of electrons into vacuum becomes possible and is effective as
long as the semimetallic film is very thin and has a work function small enough so that its
vacuum edge is located energetically below the conduction band edge of the semiconductor.
This situation is referred to as negative electron affinity (NEA) for the semiconductor
material [10]. Depending on the particular materials choices, this implies that the semimetal
work function $\phi_M$ in relation to the semiconductor energy bandgap $E_G$ must obey one of the
inequalities $\phi_M < 0.5 E_G$ or $\phi_M < E_G$ if an intrinsic or p-type doped wide bandgap
semiconductor is used, respectively. A negative $\phi_M$ implies, according to Fig. 2, that the vacuum
level would be located below the lower conduction band edge. In that case, electrons in
the conduction band with momenta pointing toward the surface have a good chance of being
emitted unless deflected by collision or trapped by impurities or defects.

This paper is organized as follows. In Section II, we derive the basic equations describing
the forward bias operation of the cold cathode emitter described above. We then calculate
the current density-voltage characteristics of the newly proposed cold cathode for specific sets
of materials and device parameters. In Section III, we investigate the importance of current
crowding effects in various cold cathodes with circular geometry. Our analysis includes a
self-consistent modeling of current crowding effects and an analysis of power dissipation in
the cold cathode active area. The influence of power dissipation on self-heating effects in the
active area of the cathode is also described. Finally, Section IV contains our conclusions.

II.  THE  MODEL 

Hereafter, we analyze the cold cathode whose energy band diagram is shown in Fig. 2
[5]. Under the influence of a large electric field in the wide bandgap semiconductor, electrons
will eventually tunnel from the left contact through the barrier at the metal-semiconductor
interface. A portion of the current emitted at the metal (or heavily doped semiconductor)/CdS
contact (which we model assuming Fowler-Nordheim injection) is transmitted at the
boundary of the LaS as well as at the vacuum boundary. However, a fraction of the current is
lost in the thin LaS quantum well and gives rise to the dynamic shift of the effective material
work function (Fig. 2). For a cathode operated at room temperature, we model this internal
field emission at the injection junction using a Fowler-Nordheim (FN) type expression for
the injected current (in A/cm$^2$) [11]

$$J_{FN} = C_1 \frac{E^2}{\Delta} \exp\left(-\frac{C_2 \Delta^{3/2}}{E}\right), \qquad (1)$$

where $C_1$ and $C_2$ are constants which depend on the wide bandgap semiconductor. In our
numerical simulations, we chose $C_1 = 1.5 \times 10^{-6}$ A/V and $C_2 = 6.9 \times 10^{7}$ (V$^{1/2}$ cm)$^{-1}$, which
are of the same order of magnitude as the constants appearing in the FN expression [11].
In Eq. (1), $\Delta$ is the barrier height (in eV) at the metal-semiconductor junction and $E$ is
the electric field (in V/cm) in the wide bandgap semiconductor [12]. We assume that the
semiconductor layer thickness is such that the transport of injected electrons is close to being
ballistic up to the interface between the semiconductor and the thin semimetallic film. In so
doing, we also neglect carrier ionization processes in the semiconductor slab, which could be
the main antagonist to ballistic transport in that region.
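The Fowler-Nordheim expression described above is easy to evaluate numerically. The sketch below assumes the form $J_{FN} = C_1 (E^2/\Delta) \exp(-C_2 \Delta^{3/2}/E)$, which is the form dimensionally consistent with the quoted units of the constants, and reads the exponents of the constants as $10^{-6}$ and $10^{7}$ (the usual FN orders of magnitude). The helper `uniform_field`, which drops the whole bias uniformly across the semiconductor, is a simplifying assumption introduced here for illustration.

```python
import math

C1 = 1.5e-6   # A/V, prefactor constant (exponent assumed to be -6)
C2 = 6.9e7    # (V^(1/2) cm)^-1, exponent constant (exponent assumed to be 7)


def fn_current_density(field_v_per_cm: float, barrier_ev: float) -> float:
    """Fowler-Nordheim injected current density in A/cm^2."""
    if field_v_per_cm <= 0.0:
        return 0.0
    prefactor = C1 * field_v_per_cm ** 2 / barrier_ev
    return prefactor * math.exp(-C2 * barrier_ev ** 1.5 / field_v_per_cm)


def uniform_field(bias_v: float, thickness_angstrom: float) -> float:
    """Field in V/cm if the bias drops uniformly across the layer (assumption)."""
    return bias_v / (thickness_angstrom * 1e-8)


# Compare the two injecting contacts at 10 V across a 500 angstrom CdS layer.
field = uniform_field(10.0, 500.0)
j_ag = fn_current_density(field, 0.56)   # Ag/CdS barrier height in eV
j_au = fn_current_density(field, 0.78)   # Au/CdS barrier height in eV
print(f"J(Ag) = {j_ag:.3e} A/cm^2, J(Au) = {j_au:.3e} A/cm^2")
```

The lower Ag/CdS barrier gives a much larger injected current at the same bias, which is the trend discussed for the two contacts.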

Because of the finite probabilities for the injected current to be transmitted at the
semiconductor-semimetal (probability $T_1$) and semimetal-vacuum (probability $T_2$) interfaces,
the total emitted current can be calculated as the sum of the contributions
resulting from the multiple reflections of electrons in the semimetallic layer (see Fig. 2).
The magnitudes of the emitted current components decrease with the number of multiple
reflections in the semimetallic layer. Rather than trying to calculate these contributions
exactly, we assume that the current amplitude is decreased by a factor $\epsilon = \exp(-L_2/\lambda_{LaS})$
for each traversal of the semimetallic layer, where $\lambda_{LaS}$ is the collisional mean free path in
the semimetallic layer and $L_2$ is the length of the semimetallic layer. Adding the contributions
resulting from multiple crossings of the semimetallic layer, the total emitted current
is found to be

$$J_{em} = \epsilon T_1 T_2 J_{FN} (1 + x + x^2 + \ldots), \qquad (2)$$

where $x = \epsilon^2 (1 - T_1)(1 - T_2)$. In calculating $J_{em}$ we limited the number of traversals of
the semimetallic slab to five to account for the fact that electrons lose energy in each crossing
and eventually do not have enough energy to surmount the barrier at the semimetal-vacuum
interface. According to Eq. (2), the contributions from the multiple reflections decrease
rapidly since, in general, the quantity $x$ will be much smaller than unity [13]. Once the
emitted current is found, the total current contributing to the increase in the sheet carrier
concentration in the thin semimetallic film can easily be written as $J_{capt} = J_{FN} - J_{em}$. The
total trapped current is then given by

$$I_t = A J_{capt} = A R J_{FN}, \qquad (3)$$

where $A$ is the area of each LaS emission window, which in practice can be either rectangular
or circular (see Figs. 1(b) and 1(c)). In Eq. (3), $R$ is the trapping coefficient of the well,

$$R = 1 - \epsilon T_1 T_2 (1 + x + x^2). \qquad (4)$$
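As a numerical illustration of the multiple-reflection sum and the trapping coefficient, the sketch below evaluates the attenuation factor, the emitted fraction $J_{em}/J_{FN}$ truncated after three terms (five traversals), and $R$. It assumes the readings $\epsilon = \exp(-L_2/\lambda_{LaS})$ and $x = \epsilon^2(1-T_1)(1-T_2)$; the function names are illustrative only.

```python
import math


def attenuation(l2_angstrom: float, mfp_angstrom: float) -> float:
    """Amplitude factor for one traversal of the semimetallic layer."""
    return math.exp(-l2_angstrom / mfp_angstrom)


def emitted_fraction(t1: float, t2: float, eps: float, n_terms: int = 3) -> float:
    """J_em / J_FN, summing n_terms of the geometric series in x."""
    x = eps ** 2 * (1.0 - t1) * (1.0 - t2)
    series = sum(x ** k for k in range(n_terms))
    return eps * t1 * t2 * series


def trapping_coefficient(t1: float, t2: float, eps: float) -> float:
    """R = 1 - J_em / J_FN with the series truncated after x^2."""
    return 1.0 - emitted_fraction(t1, t2, eps, n_terms=3)


# Parameters used in the simulations: L2 = 24.6 A, mean free path 300 A, T1 = T2 = 0.5.
eps = attenuation(24.6, 300.0)
R = trapping_coefficient(0.5, 0.5, eps)
print(f"eps = {eps:.4f}, emitted fraction = {1.0 - R:.4f}, R = {R:.4f}")
```

For these parameters roughly 70% of the injected current is captured in the well, so the trapped charge is far from a small correction.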

The semimetallic thin film can be modeled as a quantum well (QW) which will lose the
trapped electrons essentially at its lateral boundaries. In reality, Fig. 1(a) indicates that
not all electrons will move to the three-dimensional contact regions surrounding the thin
semimetallic layer; many of them will be reflected at the lateral thin film boundary with
an average probability $r$ (calculated for electrons with the Fermi velocity in the thin film).
The number of exiting electrons will depend on the thickness of the semimetallic layer and
could be adjusted by intentional passivation so that reflection at the boundaries of the thin
semimetallic film could be tuned from almost zero to nearly unity. Taking into account the
finite reflection amplitude at the thin film boundaries, the leakage current of the QW can
be written

$$\frac{dQ_t}{dt} = 2 e L N_{2D} v_F (1 - r), \qquad (5)$$

for the case of a rectangular emission window and

$$\frac{dQ_t}{dt} = 2 \pi e a N_{2D} v_F (1 - r), \qquad (6)$$

for the case of a circular emission window.

In Eqs. (5) and (6), $Q_t$ is the total charge captured by the well, $e$ is the magnitude of
the electronic charge, $N_{2D}$ is the excess sheet carrier concentration in the thin film due to
the captured electrons, and $v_F$ is the Fermi electron velocity in the semimetallic thin film.
Under steady state operation of the cold cathode, the excess charge in the two-dimensional
semimetallic film is found using Eq. (3) and imposing the current balance requirement
$dQ_t/dt = I_t = A J_{capt}$. This leads to

$$N_{2D} = \frac{W J_{capt}}{2 e (1 - r) v_F}, \qquad (7)$$

for the case of a rectangular geometry and

$$N_{2D} = \frac{a J_{capt}}{2 e (1 - r) v_F}, \qquad (8)$$

for the case of a circular geometry.
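The steady-state balance between capture and edge leakage gives the excess sheet density directly. The sketch below assumes the forms $N_{2D} = W J_{capt}/[2e(1-r)v_F]$ (rectangular window of width $W$) and $N_{2D} = a J_{capt}/[2e(1-r)v_F]$ (circular window of radius $a$), with lengths in cm, current density in A/cm$^2$ and velocity in cm/s; the function names are illustrative.

```python
E_CHARGE = 1.602176634e-19  # C, magnitude of the electronic charge


def sheet_density(size_cm: float, j_capt: float, v_fermi: float, r: float) -> float:
    """Excess sheet density in cm^-2; size_cm is W (rectangle) or a (circle)."""
    if not 0.0 <= r < 1.0:
        raise ValueError("reflection probability r must satisfy 0 <= r < 1")
    return size_cm * j_capt / (2.0 * E_CHARGE * (1.0 - r) * v_fermi)


# Example: W = 1 cm, J_capt = 1 A/cm^2, v_F = 1.36e8 cm/s (assumed value).
n_leaky = sheet_density(1.0, 1.0, 1.36e8, r=0.0)
n_reflecting = sheet_density(1.0, 1.0, 1.36e8, r=0.9)
print(f"r = 0.0: {n_leaky:.3e} cm^-2, r = 0.9: {n_reflecting:.3e} cm^-2")
```

Because of the $1/(1-r)$ factor, a nearly reflecting 2D/3D boundary stores far more charge in the well than a leaky one at the same captured current.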

Simultaneously, the change in the excess sheet carrier concentration in the QW
due to trapped electrons leads to the occupation of the bound state energy levels, according to
the energy density of states, up to an energy level which establishes the dynamic Fermi level
$E_F$. The Fermi velocity $v_F$ entering Eqs. (5) and (6) must be calculated self-consistently
because of the dynamic work function shift $|\Delta\chi|$ illustrated in Fig. 2. This dynamic shift
$|\Delta\chi|$ is equal to $|E_F - E_F^0|$, where $E_F^0$ is the Fermi level in the thin semimetallic layer
under zero bias. For simplicity, we assume that the electrons in the conduction band of the
semimetallic film can be described using the Sommerfeld theory of metals while assuming
s-band conduction in the semimetallic thin film and while modeling the thin film using the
particle in a box model for the quantum well [14]. The set of equations (1)-(8) is then solved
self-consistently to calculate the work function shift $|\Delta\chi|$ as a function of the externally
applied bias. Once the dynamic shift has been determined self-consistently, Eq. (2) can then
be used to determine the emitted current.
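To illustrate what "self-consistently" means here, the sketch below runs a fixed-point iteration between the sheet density, the Fermi velocity and the Fermi-level shift. It is a deliberately simplified stand-in, not the model of the text: it assumes a single 2D subband with the free-electron mass (so the shift is $\pi\hbar^2 N_{2D}/m$), a parabolic band to relate energy and velocity, and a captured current density held fixed during the iteration. All of these, and the function name, are assumptions made for illustration.

```python
import math

E_CHARGE = 1.602176634e-19   # C
HBAR = 1.054571817e-34       # J s
M_E = 9.1093837015e-31       # kg, free-electron mass (assumed effective mass)


def self_consistent_shift(j_capt, size_cm, r, v_f0_cm_s, tol=1e-12, max_iter=200):
    """Return (shift in eV, N_2D in cm^-2, v_F in cm/s) at the fixed point."""
    if not 0.0 <= r < 1.0:
        raise ValueError("reflection probability r must satisfy 0 <= r < 1")
    e_f0 = 0.5 * M_E * (v_f0_cm_s * 1e-2) ** 2   # zero-bias Fermi energy, J
    shift = 0.0                                   # Fermi-level shift, J
    for _ in range(max_iter):
        # Fermi velocity for the current estimate of the Fermi level.
        v_f = math.sqrt(2.0 * (e_f0 + shift) / M_E) * 1e2
        # Steady-state sheet density from the capture/leakage balance.
        n2d = size_cm * j_capt / (2.0 * E_CHARGE * (1.0 - r) * v_f)
        # Shift of the Fermi level for a constant 2D density of states.
        new_shift = math.pi * HBAR ** 2 * (n2d * 1e4) / M_E
        converged = abs(new_shift - shift) <= tol * max(new_shift, 1e-30)
        shift = new_shift
        if converged:
            break
    return shift / E_CHARGE, n2d, v_f


shift_ev, n2d, v_f = self_consistent_shift(100.0, 1.0, 0.9, 1.36e8)
print(f"shift = {shift_ev:.4f} eV, N_2D = {n2d:.3e} cm^-2, v_F = {v_f:.4e} cm/s")
```

The iteration converges in a few steps because the shift is small compared with the zero-bias Fermi energy in this toy parameter range; a full treatment would also update the captured current at each step.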

RESULTS 

We consider a specific structure with the material and structural parameters listed in
Tables I and II, respectively. Both Au and Ag are known to form contacts to thin films of
semiconducting (n-type) CdS. In that case, the barrier height $\Delta$ shown in Fig. 2 is quite
small and is equal to 0.78 eV and 0.56 eV for the case of Au and Ag contacts, respectively
[11]. The lattice constant of CdS (5.83 Å) is very close to the lattice constant of the thin
semimetallic surface layer LaS (5.85 Å), which in its cubic crystalline structure will therefore
be lattice matched to the semiconducting material. Additionally, LaS is expected to have
quite a low room temperature work function (1.14 eV) [15], a feature which, when combined with
the large energy gap (2.5 eV) of CdS, leads to NEA of the semiconductor material. In the
following numerical simulations, the thicknesses of the CdS ($L_1$) and LaS ($L_2$) layers are set
equal to 500 Å and 24.6 Å (4 monolayers), respectively. We model a cathode with a square
($W = L$) emission window with a 1 cm$^2$ area.
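The NEA criterion stated in the introduction can be checked against these material parameters. The sketch below assumes the criterion is read as $\phi_M < 0.5 E_G$ for an intrinsic semiconductor and $\phi_M < E_G$ for a p-type one; the function name and the `doping` labels are illustrative.

```python
def satisfies_nea(work_function_ev: float, bandgap_ev: float,
                  doping: str = "intrinsic") -> bool:
    """Check the NEA inequality for an intrinsic or p-type semiconductor."""
    if doping == "intrinsic":
        limit = 0.5 * bandgap_ev
    elif doping == "p-type":
        limit = bandgap_ev
    else:
        raise ValueError("doping must be 'intrinsic' or 'p-type'")
    return work_function_ev < limit


# LaS work function 1.14 eV on undoped CdS with a 2.5 eV gap.
print(satisfies_nea(1.14, 2.5))            # intrinsic criterion: 1.14 < 1.25
print(satisfies_nea(1.14, 2.5, "p-type"))  # p-type criterion: 1.14 < 2.5
```

The intrinsic criterion is met with a margin of only about 0.1 eV, so the result is sensitive to the actual LaS work function.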

Figure 3 is a plot of the dynamic work function shift as a function of applied bias for
the cold cathode emitter with both Au and Ag injecting contacts. The following parameters
were used: $\lambda_{LaS}$ = 300 Å, $T_1 = T_2 = 0.5$, and $v_F = 1.36 \times 10^{8}$ cm/s. Figure 3 indicates that
the dynamic shift of the LaS work function is sensitive to the quality of the interface between
the two-dimensional semimetallic layer and the three-dimensional contacts, which we model
by varying the reflection coefficient $r$ between the two-dimensional semimetallic thin film
and the three-dimensional contact regions (see Fig. 1(a)). It should be noticed that the LaS
work function shift can approach the LaS work function even for the case of a leaky interface
between the thin semimetallic layer and the 3D contact regions. The dynamic shift $|\Delta\chi|$ is
comparable to the work function of LaS for a smaller value of the applied bias in the case of
the Ag contact because of the lower barrier at the Ag/CdS interface.

Figure 4 compares the emitted current densities $J_{em}$ for the structure with Au and Ag
contacts calculated while including or neglecting the effects of the dynamic shift of the LaS
work function. The current density versus bias plots are stopped at the values of the bias at
which $|\Delta\chi| = \phi_M(\mathrm{LaS}) = 1.14$ eV. Beyond that point, the theory presented here is no longer
valid since we would need to include the spillover of the excess trapped carriers into vacuum.
As can be seen in Fig. 4, the emitted current densities can be more than a factor of two larger
when the effects of the dynamic shift of the work function of the semimetal are included.
The effects could be made more drastic if a set of materials and device parameters could be
found for which the dynamic shift of the work function could be made comparable to the
work function itself at a fairly low value of the applied bias (< 5 V).

Sensitivity  of  Dynamic  Work  Function  Shift  on  Design  Parameters 

The previous numerical examples have shown that, under forward bias operation, the
electrons captured in the low work function material are responsible for an effective reduction
of the semimetallic film work function together with a substantial increase of the cathode
emitted current. This dynamic work function shift was shown to increase with the amount of
injected current. Hereafter, we perform a more extensive study of the dynamic work function
shift which includes variations of the length of the CdS region ($L_1$), the electron mean free
path in the LaS region ($\lambda_{LaS}$), the emission window size ($W$), the transmission coefficients
at the CdS/LaS ($T_1$) and LaS/vacuum ($T_2$) interfaces, and the reflection at the 2D/3D
interface region in the LaS quantum well ($r$).

Figure 3 indicates that the dynamic work function shift $|\Delta\chi|$ rises exponentially above
a threshold voltage of several volts and rapidly (within a range of a few volts) reaches a value
comparable to the LaS work function. For a structure with the parameters listed in Table I
and with the structural and physical parameters ($L_1$ = 500 Å, $L_2$ = 24.6 Å, $W$ = 1 cm, $\lambda_{LaS}$ =
300 Å, $T_1 = T_2 = 0.5$, $\Delta$(Ag) = 0.56 eV), Fig. 5(a) shows that the dynamic work function
shift rises exponentially at a lower bias as the reflection coefficient at the 2D/3D interface
in the LaS region approaches unity. Figure 5(b) also shows that the difference between
the current densities calculated with and without the dynamic work function shift
is more pronounced at smaller values of the applied bias when the reflection coefficient
between the 2D and 3D LaS regions approaches unity. This results from the fact that
any mechanism (like $r$ being closer to unity) which increases the amount of charge
trapped in the LaS quantum well leads to an enhancement of the dynamic work function
shift at a given bias. For instance, all other cathode parameters being equal, the dynamic
work function shift rises much faster as a function of applied bias in structures with thinner CdS
regions (Fig. 6), with smaller values of the transmission coefficients $T_1$ and $T_2$ (Fig. 7), or
with a smaller mean free path ($\lambda_{LaS}$) in the semimetallic thin film (Fig. 8). As shown in Fig.
8(a), the dynamic work function shift occurs at a lower bias as $\lambda_{LaS}$ is decreased. On the
other hand, the emitted current density is lower at a given bias in a cathode whose LaS thin
film has a shorter electron mean free path (Fig. 8(b)). We have found this trend to be valid
for all values of the reflection coefficient at the LaS 2D/3D interface. However, there is a
larger spread in the family of curves representing the bias dependence of the dynamic work
function shift and emitted current densities as a function of the mean free path in the LaS thin
film for smaller values of the reflection coefficient $r$. Finally, even though not shown here,
the exponential rise of the dynamic work function shift and emitted current density has been
shown to occur at a lower value of the bias by either reducing the thickness of the CdS layer
or by lowering the barrier height $\Delta$ at the metal/CdS interface.

Before leaving this section, we make two additional remarks on the bias dependence
of the emitted current density and the dynamic work function shift, which we have checked
numerically on all the cold cathodes modeled in this work. First, since the emission of
electrons at the metal (or heavily doped semiconductor)/CdS interface is assumed to be of the
Fowler-Nordheim type, the emission current is expected to have a bias dependence of the
form

$$J_{FN} = A V_{bias}^2 \exp(-B/V_{bias}). \qquad (9)$$

This was checked numerically for all the cold cathodes modeled here. A typical example is
given in Fig. 9(a), with the values of the parameters $A$ and $B$ listed in the inset. Most of
the quantities of interest to be determined hereafter (power dissipation, lateral variation of
emitted current, lateral potential drop, ...) can be calculated analytically in closed form if the
following approximation is used:

$$J_{FN} = J_0 e^{\alpha V_{bias}}. \qquad (10)$$

For all the cold cathodes simulated here, we have shown that Eq. (10) gives a fairly accurate
fit to the plot of the emitted current density versus applied bias if the range of the fit is
restricted to current densities between 1 and 1000 A/cm$^2$. A typical fit for one of the
cold cathodes studied here is shown in Fig. 9(b), with the values of the parameters $J_0$ and $\alpha$
in Eq. (10) shown in the inset. Table III gives a summary of the parameters $J_0$ and $\alpha$ for cold
cathodes of different widths and with the physical parameters listed in the caption of Fig. 5.
We have found that the bias dependence of the dynamic work function shift is also of the
Fowler-Nordheim type, i.e.,

$$\Delta\chi = A_0 V_{bias}^2 \exp(-V_0/V_{bias}). \qquad (11)$$

This is illustrated in Fig. 10 for the cold cathode with the same parameters as in Fig. 9. The
values of $A_0$ and $V_0$ are indicated in the inset of Fig. 10. We point out that the values of the
parameters $B$ and $V_0$ in Eqs. (9) and (11) are nearly identical.
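A standard way to check that a current-voltage curve has the Fowler-Nordheim form $J = A V^2 \exp(-B/V)$ is to plot $\ln(J/V^2)$ against $1/V$, which should give a straight line of slope $-B$ and intercept $\ln A$. The sketch below performs that linear least-squares fit in plain Python on synthetic data; the function name and the synthetic parameter values are illustrative and are not taken from the figures.

```python
import math


def fit_fowler_nordheim(biases, currents):
    """Least-squares fit of ln(J/V^2) = ln(A) - B/V; returns (A, B)."""
    xs = [1.0 / v for v in biases]
    ys = [math.log(j / v ** 2) for v, j in zip(biases, currents)]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), -slope


# Synthetic FN curve with hypothetical parameters A = 2.0, B = 30.0.
biases = [4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
currents = [2.0 * v ** 2 * math.exp(-30.0 / v) for v in biases]
a_fit, b_fit = fit_fowler_nordheim(biases, currents)
print(f"A = {a_fit:.6f}, B = {b_fit:.6f}")
```

Applying the same fit to the work function shift versus bias would give the pair $(A_0, V_0)$, so the statement that $B$ and $V_0$ nearly coincide can be tested by comparing the two fitted slopes.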

Temperature  Rise  in  the  Cold  Cathode 

Hereafter, we derive an upper estimate of the temperature rise in the LaS thin film as a
result of the power dissipation mechanisms discussed above. We focus on a cold cathode
where the CdS thin film is deposited on an InP substrate as shown in Fig. 1(a). The successful
growth of cubic CdS thin films with good crystalline quality on InP substrates has been
reported recently by Shen and Kwok [9]. For the case of an InP/CdS interface, there has not
been any report of the conduction band discontinuity $\Delta E_c$ at the interface between the two
materials, to the best of our knowledge. For that reason, $\Delta E_c$ was assumed to be given by
Anderson's rule, i.e., $\Delta$ in Fig. 2 is assumed to be given by $|\chi(\mathrm{InP}) - \chi(\mathrm{CdS})| = 0.2$ eV
[16]. This estimate was based on the measured electron affinities of InP (4.4 eV) and CdS
(4.2 eV) reported in Refs. [17] and [18], respectively. The back contact to the substrate is
assumed to be perfectly ohmic and to act as a perfect heat sink (300 K). Since the device
area (heat source formed of the LaS thin film) is much thinner than the substrate, it is
necessary to consider the effect of heat spreading laterally in the substrate. Furthermore,
we assume that the temperature of the CdS layer will be the same as that of the LaS top layer.
The active area of the cathode (CdS and LaS layers) is therefore assumed to be acting as a
heat source with the power density calculated in the previous section. Because of the finite
thermal conductivity of the substrate, we expect self-heating effects to affect the operation
of the cold cathode if the power level dissipated in the active area of the cathode becomes
too large. Hereafter, we model the thermal conductivity of the InP substrate as follows:

$$\kappa(T) = \kappa_0 (T/T_0)^{-b}, \qquad (12)$$


where $\kappa_0$ = 0.74 W/(K cm) is the thermal conductivity of InP at room temperature $T_0$ (300 K)
and $b$ = 1.45. Starting with Fourier's law and making use of a
Kirchhoff transformation to take into account the temperature dependence of the thermal
conductivity of the InP substrate given by Eq. (12), it can be shown that the active area of
the cold cathode will be operated at a temperature given by

$$T = \left[ \frac{1}{T_0^{\,b-1}} - \frac{(b-1) R_{th,0} P_{diss}}{T_0^{\,b}} \right]^{-1/(b-1)}, \qquad (13)$$


where $T_0$ is the ambient room temperature (assumed to be 300 K hereafter), $P_{diss}$ is the total
power dissipated per finger as calculated in the previous section, and

$$R_{th,0} = \frac{1}{\kappa_0} \int_0^{z_s} \frac{dz}{A(z)}, \qquad (14)$$

where $z_s$ is the thickness of the InP substrate. Our estimate of the temperature rise in the
active cold cathode area will give an upper estimate of the temperature of operation since
we neglected heat conduction to the top Au contacts in the thick portion of the LaS thin
film.
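The self-heating estimate can be sketched numerically. The code below assumes the power-law conductivity $\kappa(T) = \kappa_0 (T/T_0)^{-b}$ and the corresponding Kirchhoff-transformation result $T = [T_0^{1-b} - (b-1) R_{th,0} P_{diss}/T_0^{b}]^{-1/(b-1)}$, which reduces to $T_0 + R_{th,0} P_{diss}$ for small dissipated power. The function names and the example thermal resistance of 50 K/W are hypothetical values chosen for illustration.

```python
def thermal_conductivity(t_k: float, kappa0: float = 0.74,
                         t0_k: float = 300.0, b: float = 1.45) -> float:
    """Power-law thermal conductivity in W/(K cm)."""
    return kappa0 * (t_k / t0_k) ** (-b)


def active_area_temperature(p_diss_w: float, r_th0_k_per_w: float,
                            t0_k: float = 300.0, b: float = 1.45) -> float:
    """Steady-state temperature (K) of the active area for dissipated power p_diss_w."""
    inv = t0_k ** (1.0 - b) - (b - 1.0) * r_th0_k_per_w * p_diss_w / t0_k ** b
    if inv <= 0.0:
        raise ValueError("no finite steady-state temperature in this model")
    return inv ** (-1.0 / (b - 1.0))


# Hypothetical example: R_th,0 = 50 K/W and 0.2 W dissipated (linear estimate: +10 K).
t_hot = active_area_temperature(0.2, 50.0)
print(f"T = {t_hot:.2f} K, kappa(T) = {thermal_conductivity(t_hot):.3f} W/(K cm)")
```

The computed rise is slightly larger than the linear estimate of 10 K because the conductivity of the substrate drops as it heats up.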
III. Current Crowding Effects in Proposed Cold
Cathode with a Circular Geometry


In this section, we study the effects of current crowding in the case of an emission window
with circular geometry (see Fig. 1(c)). The lateral potential drop in the circular LaS window
satisfies the following differential equation [19]

$$\frac{dV}{dr} = \frac{\rho_s}{t}\, i(r), \qquad (15)$$

where $i(r)$ is the total lateral current per unit length flowing outward across a circle of radius
$r$ whose center coincides with the center of the emission window. The lateral current satisfies
the following equation:

$$2 \pi r\, i(r) = 2 \pi R \int_0^r dr'\, r'\, j(r'). \qquad (16)$$

If we further assume that the Fowler-Nordheim emitted current $j(r)$ can be approximated
by Eq. (10) over the range of dc bias considered here, the following second-order differential
equation must be satisfied by the lateral potential drop:

$$\frac{d^2 V}{dr^2} + \frac{1}{r} \frac{dV}{dr} = \frac{\rho_s R J_0}{t}\, e^{V(r)/V_T}. \qquad (17)$$

This differential equation must be solved subject to the following boundary conditions, valid
for the circular geometry:

$$\frac{dV}{dr} = 0 \qquad (18)$$

at $r = 0$, and

$$V(r = a) = V_{bias} \qquad (19)$$

at the edge of the circular window. Introducing the reduced variable $\bar{r} = r/a$ and the
quantity

$$Y = \frac{V(r) - V_{bias}}{V_T}, \qquad (20)$$

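Eq. (17) can be recast as follows:

$$\frac{d^2 Y}{d\bar{r}^2} + \frac{1}{\bar{r}} \frac{dY}{d\bar{r}} = 2 \gamma^2 e^{Y}, \qquad (21)$$

where $\gamma^2 = \rho_s R J_0 a^2 e^{V_{bias}/V_T}/(2 t V_T)$ is expressed through the parameter $\beta$, which is
identical to the one defined for the planar problem [20].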
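The general solution of Eq. (21) can be found analytically and is given by

$$Y(\bar{r}) = \ln\left[ \frac{c\, \delta^2\, \bar{r}^{\,\delta - 2}}{\gamma^2 \left(1 - c\, \bar{r}^{\,\delta}\right)^2} \right], \qquad (22)$$

where $\delta$ and $c$ are constants to be determined so that the boundary conditions (18) and (19) are
satisfied. Using the new system of variables, Eq. (18) becomes

$$\frac{dY}{d\bar{r}} = 0 \quad \text{at } \bar{r} = 0, \qquad (23)$$

while Eq. (19) now reads

$$Y(\bar{r} = 1) = 0. \qquad (24)$$

Equation (23) leads to

$$\delta = 2, \qquad (25)$$

and Equation (24) becomes

$$\gamma = \frac{\delta \sqrt{c}}{1 - c}. \qquad (26)$$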

Combining these last two equations, we obtain the following result:

$$\gamma^2 = \frac{4c}{(1 - c)^2}. \qquad (27)$$

Equation (27) can be solved exactly for the parameter $c$,

$$c = \left[ \gamma^2 + 2 - 2\sqrt{\gamma^2 + 1} \right] / \gamma^2, \qquad (28)$$

in terms of which we can write various quantities of interest, including the ratio

$$J(0)/J(a) = [1 - c]^2, \qquad (29)$$

chau'acterizing  the  importiince  of  current  crowding  in  the  circular  geometry.  Using  Eqn.(22), 
the  radial  dependence  of  the  lateral  potenti2Ll  drop  if  found  to  be 

^(r)  =  - 2VrJn[°^^l,  (30) 

from  which  the  maximum  value  of  the  in  plane  electric  field  is  found  to  be 


Er(r  =  a) 


4c  Vr 
c  —  1  a  ’ 


(31) 
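
Equations (28)-(31) are simple enough to evaluate directly. The Python sketch below is illustrative only: the function names are hypothetical, and the expression used for $c$ is an algebraically equivalent rewriting of Eq.(28) chosen for numerical stability, not the form printed in the text. It also checks by finite differences that the profile of Eq.(30) satisfies Eq.(21).

```python
import math

def crowding_parameter(gamma_sq):
    """Root of gamma^2 = 4c/(1-c)^2 with 0 <= c < 1 (Eq. 28).

    Written as gamma^2 / (gamma^2 + 2 + 2*sqrt(gamma^2 + 1)), which equals
    Eq. 28 but avoids cancellation when gamma^2 is small.
    """
    return gamma_sq / (gamma_sq + 2.0 + 2.0 * math.sqrt(gamma_sq + 1.0))

def reduced_potential(r_bar, c):
    """Y = (V - V_bias)/V_T at reduced radius r_bar = r/a (Eq. 30)."""
    return -2.0 * math.log((1.0 - c * r_bar ** 2) / (1.0 - c))

def current_ratio(c):
    """Center-to-rim emitted current density ratio j(0)/j(a) (Eq. 29)."""
    return (1.0 - c) ** 2

def rim_field(c, v_t, a):
    """In-plane field at the rim, E_r(a) = 4c/(c-1) * V_T/a (Eq. 31)."""
    return 4.0 * c / (c - 1.0) * v_t / a

def ode_residual(r_bar, gamma_sq, h=1e-4):
    """Finite-difference residual of Eq. 21 for the profile of Eq. 30."""
    c = crowding_parameter(gamma_sq)
    y_m, y_0, y_p = (reduced_potential(r_bar + s * h, c) for s in (-1, 0, 1))
    d2 = (y_p - 2.0 * y_0 + y_m) / h ** 2
    d1 = (y_p - y_m) / (2.0 * h)
    return d2 + d1 / r_bar - 2.0 * gamma_sq * math.exp(y_0)

if __name__ == "__main__":
    gamma_sq = 0.2  # illustrative value, not taken from the text
    c = crowding_parameter(gamma_sq)
    print("c =", c)
    print("j(0)/j(a) =", current_ratio(c))
    print("ODE residual at r/a = 0.5:", ode_residual(0.5, gamma_sq))
```

The residual should be zero up to discretization error, which confirms that Eqs.(27) and (30) are mutually consistent with the factor $2\gamma^2$ in Eq.(21).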


As in the case of the rectangular geometry, the total power dissipated in the LaS thin
film is given by the sum of the following four contributions [20]. The power dissipated by
the electrons being trapped in the LaS circular thin film is given by

$$P_1 = 2\pi R \int_0^a dr\, r\, j(r)\, V(r). \tag{32}$$

The power dissipated by Joule heating as trapped electrons move to the edge of the LaS
window is given by

$$P_2 = \frac{2\pi \rho_s R^2}{t} \int_0^a \frac{dr}{r} \left[\int_0^r r'\, j(r')\, dr'\right]^2. \tag{33}$$

The third contribution to power dissipation comes from Joule heating linked to the
current making its way from the LaS thin film to the Au contacts on top of the thick LaS
regions:

$$P_3 = R_c \left(2\pi a\, i(a)\right)^2, \tag{34}$$

where $R_c$ is the resistance of the LaS region between the edge of the LaS thin film and the
top Au layer. This resistance can be estimated as follows [21]

^  t2irak-V  ^  ’ 

where $k = H/t$, $a$ is the radius of the circular window, and $H$ is the height of the thick LaS
region.

Finally, there is also a contribution to power dissipation due to the blocking effect on
the Fowler-Nordheim emission current emitted under the wide LaS contacts:

P4  =  +  (36) 

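Starting with Eqns.(10) and (30), the different contributions to the power dissipation
can be calculated exactly and are found to be

$$P_1 = \pi a^2 J_0 R\, e^{\alpha V_{bias}} V_{bias} (1 - c) - \pi a^2 J_0 R\, e^{\alpha V_{bias}} V_T \left[2(1 - c) + \frac{(1 - c)^2}{c} \ln\left[(1 - c)^2\right]\right], \tag{37}$$

$$P_2 = R\, (\pi a^2 J_0 e^{\alpha V_{bias}})\, V_T \left[2(1 - c) + \frac{(1 - c)^2}{c} \ln\left[(1 - c)^2\right]\right], \tag{38}$$

$P_3$ is given by Eq.(34) and $P_4$ is found to be

where $I_{em}$ is the total emitted current through the circular window

$$I_{em} = 2\pi (1 - R) \int_0^a r\, j(r)\, dr. \tag{40}$$

The latter can be calculated explicitly and is found to be

$$I_{em} = (1 - R)(\pi a^2 J_0 e^{\alpha V_{bias}})(1 - c). \tag{41}$$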
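
The closed forms for $P_1$, $P_2$ and $I_{em}$ can be checked against direct numerical integration of Eqs.(32), (33) and (40). The Python sketch below does this under two stated assumptions: the emitted current density follows $j(r) = J_0 e^{\alpha V_{bias}} e^{Y(r)}$ with $V_T = 1/\alpha$, and $\gamma^2$ is linked to the device parameters by $2\gamma^2 = \rho_s R J_0 e^{\alpha V_{bias}} a^2/(t V_T)$. All function names and the example values (in particular the trapped fraction R = 0.5 and the rim current density) are hypothetical.

```python
import math

def simpson(f, lo, hi, n=400):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(lo + k * h)
    return total * h / 3.0

def window_powers(a, j_rim, R, v_t, v_bias, sheet_res):
    """Compare Eqs. 37, 38, 41 with direct quadrature of Eqs. 32, 33, 40.

    a         window radius (cm)
    j_rim     J0*exp(alpha*V_bias), current density at the rim (A/cm^2)
    R         trapped fraction of the current (dimensionless)
    v_t       V_T (V); v_bias is the applied bias (V)
    sheet_res rho_s / t, sheet resistance of the thin film (ohm)
    """
    # Assumed link between gamma^2 and the device parameters (see lead-in).
    gamma_sq = sheet_res * R * j_rim * a ** 2 / (2.0 * v_t)
    c = gamma_sq / (gamma_sq + 2.0 + 2.0 * math.sqrt(gamma_sq + 1.0))

    bracket = 2.0 * (1.0 - c) + (1.0 - c) ** 2 / c * math.log((1.0 - c) ** 2)
    flat = math.pi * a ** 2 * j_rim          # window current with no crowding
    closed = {
        "P1": flat * R * v_bias * (1.0 - c) - flat * R * v_t * bracket,
        "P2": R * flat * v_t * bracket,
        "I_em": (1.0 - R) * flat * (1.0 - c),
    }

    def j(r):                                 # emitted current density
        return j_rim * ((1.0 - c) / (1.0 - c * (r / a) ** 2)) ** 2

    def v(r):                                 # potential in the film, Eq. 30
        return v_bias - 2.0 * v_t * math.log((1.0 - c * (r / a) ** 2) / (1.0 - c))

    def enclosed(r):                          # integral of r' j(r') from 0 to r
        return simpson(lambda s: s * j(s), 0.0, r, 100) if r > 0.0 else 0.0

    def joule(r):                             # integrand of Eq. 33
        return enclosed(r) ** 2 / r if r > 0.0 else 0.0

    numeric = {
        "P1": 2.0 * math.pi * R * simpson(lambda r: r * j(r) * v(r), 0.0, a),
        "P2": 2.0 * math.pi * sheet_res * R ** 2 * simpson(joule, 0.0, a),
        "I_em": 2.0 * math.pi * (1.0 - R) * simpson(lambda r: r * j(r), 0.0, a),
    }
    return c, closed, numeric

if __name__ == "__main__":
    # 50 um radius window, LaS film with 92 uOhm cm resistivity, 24.6 A thick.
    c, closed, numeric = window_powers(
        a=50e-4, j_rim=40.0, R=0.5, v_t=1.0 / 2.09, v_bias=6.8,
        sheet_res=92e-6 / 24.6e-8)
    for key in closed:
        print(key, closed[key], numeric[key])
```

A useful by-product of the check is the identity $P_1 + P_2 = R\,\pi a^2 J_0 e^{\alpha V_{bias}} (1 - c)\, V_{bias}$: the bracketed terms of Eqs.(37) and (38) cancel, so the trapped current delivers its full bias energy to the film, split between trapping and Joule heating.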
The input power (per emission window) delivered by the power supply biasing the cold
cathode is given by

$$P_{input} = \int 2\pi r\, j(r)\, V(r)\, dr, \tag{42}$$

which can readily be shown to be given by

$$P_{input} = P_1/R + P_4. \tag{43}$$

The power efficiency of the cold cathode can be calculated as follows

VP  =  - •  (44) 

and the temperature rise in the cathode is given by Eq.(13), where the thermal resistance
$R_{th,0}$ must be calculated for the case of power dissipation through the substrate from a heat
source with circular geometry. In this case, we find

\  (45) 

<0  a  [a  +  z,tan&\ 

in which the heat spreading angle $\theta$ is set equal to 45° in the numerical examples below, for
simplicity.
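
A common way to estimate a thermal resistance of this kind is the truncated-cone model: heat leaves a disk of radius $a$ and spreads through the substrate inside a cone that widens at the angle $\theta$. The Python sketch below assumes that model, which may differ in detail from Eq.(45). Integrating $dz/[\kappa\pi(a + z\tan\theta)^2]$ over a substrate of thickness $z_s$ and thermal conductivity $\kappa$ gives $z_s/[\pi\kappa\, a\,(a + z_s\tan\theta)]$. Function names and example values are illustrative.

```python
import math

def cone_thermal_resistance(a, z_s, kappa, theta_deg=45.0):
    """Thermal resistance (K/W) of a truncated cone of substrate.

    a: source radius (cm), z_s: substrate thickness (cm),
    kappa: thermal conductivity (W/cm K), theta_deg: spreading angle.
    """
    tan_t = math.tan(math.radians(theta_deg))
    return z_s / (math.pi * kappa * a * (a + z_s * tan_t))

def cone_thermal_resistance_numeric(a, z_s, kappa, theta_deg=45.0, n=20000):
    """Same quantity from a midpoint-rule sum over thin slabs of height dz."""
    tan_t = math.tan(math.radians(theta_deg))
    dz = z_s / n
    total = 0.0
    for k in range(n):
        z = (k + 0.5) * dz
        total += dz / (kappa * math.pi * (a + z * tan_t) ** 2)
    return total

if __name__ == "__main__":
    # 20 um diameter window on a 100 um thick InP substrate (0.74 W/cm K).
    print(cone_thermal_resistance(10e-4, 100e-4, 0.74))
    print(cone_thermal_resistance_numeric(10e-4, 100e-4, 0.74))
```

In this model the resistance scales roughly as $1/a$ when the substrate is much thicker than the window radius, so smaller windows run hotter for the same dissipated power.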


Numerical  Examples 


Figure 11 shows the variation of the parameter $c$ in Eq.(28) as a function of applied
bias for cold cathodes with circular emission windows of different radii. For all cathodes,
the physical parameters are the same as listed in the caption of Fig. 5. As in the case of a
rectangular window, we use the criterion that current crowding is negligible if the lateral
potential drop between the center and the edge of the LaS circular window, $V(r = 0) - V_{bias}$,
is kept less than $0.1\,V_T$ in magnitude. Using Eqns.(28) and (30), we find that this criterion requires the
parameter $c$ to be less than 0.05. This limit is indicated as a vertical line in Fig. 11. The
family of curves in Fig. 11 is parametrized with the radius of the emission window. Figure 11
shows that the range of dc bias over which current crowding can be neglected in a circular
window is comparable to the range of dc bias over which current crowding is negligible in a
rectangular window whose width is equal to the radius of the circular emission window [20].
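
The 0.05 threshold follows from evaluating Eq.(30) at the center of the window, where the drop equals $-2V_T\ln(1 - c)$ in magnitude. A short Python check (the function name is illustrative):

```python
import math

def c_limit(max_drop_in_vt):
    """Largest c keeping |V(0) - V_bias| below max_drop_in_vt * V_T.

    From Eq. 30 at r = 0, |V(0) - V_bias| = -2 V_T ln(1 - c).
    """
    return 1.0 - math.exp(-max_drop_in_vt / 2.0)

c_max = c_limit(0.1)
gamma_sq_max = 4.0 * c_max / (1.0 - c_max) ** 2   # Eq. 27
print(round(c_max, 4), round(gamma_sq_max, 4))
```

The limit comes out just under 0.05, and by Eq.(29) the emitted current density at the center is then within about 10% of its value at the rim.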

Figure 12 illustrates the importance of current crowding on the lateral potential drop
in emitter windows of different radii. The left frames show the radial dependence of the
electrostatic potential for four different values (1, 10, 100, 1000 A/cm²) of the current density at
the rim of the circular LaS window. The right frames in Fig. 12 show the corresponding radial
dependence of the emitted current density. From Fig. 12, it can be seen that current crowding
is negligible in emitter windows with radius less than 50 µm if the emitted current density
is kept under 10 A/cm². As in the case of LaS windows with rectangular geometry, Figure
12 shows that the current density profiles are much more sensitive to the finite resistivity of
the LaS thin film than the lateral potential drop.

Figure 13 is a plot of the four contributions to the total power dissipated in cold
cathodes with different radii as a function of the parameter $c$. For all cathodes, the
power dissipation due to Joule heating in the LaS thin film and the thick LaS regions is
negligible compared to the power released by electrons being trapped in the LaS thin film and
by electrons blocked in the thick LaS regions. The latter is always about one order of magnitude
higher than the former. Figure 13 shows that substantial power dissipation occurs in
cathodes with radius under 50 µm while the cathode is still operating without any substantial
current crowding effects (i.e., $c < 0.05$). The power efficiency $\eta_P$ of cathodes of different
widths is plotted as a function of $V_{bias}$ and the emitted current density $J_{em}$ in Figs. 14(a)
and 14(b), respectively. For all window sizes, the efficiency decreases with $V_{bias}$ and $J_{em}$ as a
result of current crowding. The efficiency is more or less constant over a wider range of
$V_{bias}$ for windows with smaller radius because current crowding is less important in that case, as
illustrated in Fig. 12. The overall lower efficiency for windows with smaller radius illustrated
in Fig. 14 comes from the fact that the width of the thick LaS regions was set equal to
100 µm for all cathodes. The efficiency of cathodes could be increased by making the ratio
b/a in Fig. 1(b) closer to unity.

Figure 15 shows the temperature of the active area of a cold cathode with the parameters
listed in Table I as a function of $V_{bias}$, for emitter windows with different radii. The
thicknesses of the LaS contacts and InP substrate were set equal to 100 Å and 100 µm, respectively.
As in the case of cold cathodes whose emission window has a rectangular geometry [20],
Fig. 15 indicates that to limit the temperature rise in any cathode to less than 200 K, the
dc bias must be limited to a smaller range for emitters with smaller window radius. For
instance, according to Figures 11 and 15, a cathode with a 20 µm diameter can be operated
up to 8.3 V with negligible self-heating effects (ΔT around 100 K). For that bias, Figure 12
indicates that current crowding would be negligible in the cathode and the emitted current
density would be around 100 A/cm² (see Fig. 16). On the other hand, Figure 11 indicates
that a 100 µm diameter window can be operated up to 6.8 V before current crowding becomes
non-negligible. At this bias, the emitted current density would be around 15 A/cm² (Fig.
16) while the temperature rise in the device would be only about 15 K, as shown in Fig. 15.

IV.  CONCLUSIONS 

We have proposed a new cold cathode emitter which consists of a thin wide bandgap
semiconductor material sandwiched between a metallic material or heavily doped semiconductor
and a low work function semimetallic thin film. We have shown that the capture of electrons
by thin semimetallic layers grown on the escape surface of wide bandgap semiconductors can
lead to a dynamical shift of the work function of the semimetallic layers together with an
increase of the cathode emission current. Our studies, in which the device and physical parameters
of the structure were varied, suggest that any mechanism which promotes additional charge
deposition in the well enhances the dynamic work function shift, thereby increasing
the emitted current. Potential material candidates were proposed for cold cathode operation
with applied bias under 10 V, with current densities approaching several tens of A/cm², and
with large power efficiencies ($\eta_P$ approaching 15 %).

The results of our analysis show that a cold cathode with either a rectangular or circular
emission window and with the parameters listed in Tables I and II would emit a uniform
current density of about 15 A/cm² at a dc biasing voltage of about 8 V. For that bias, the
effects of current crowding would be negligible, as would the temperature rise in the active area
of the cathode (CdS/LaS layers) resulting from self-heating.

Further improvements to the theory should include a more realistic model for the emission
current and transport through the wide bandgap material. They should also account for the
energy density of states of the semimetallic film (to capture the d-band character of the
conduction band in the chosen rare-earth semimetallic samples [14]), the finite probability for
electron wavefunctions in the thin semimetallic films to extend into the semiconductor material [22],
a more accurate description of the energy loss mechanisms [23], and screening effects (including
the lateral ohmic voltage drop) in thin semimetallic layers [24]. Finally, our study of self-heating
effects should include partial cooling of the cathode due to heat conduction through the thick LaS
layers, which was neglected in this study. Including it would slightly extend the dc
bias operating range of the cathode beyond the estimate reported here. Furthermore, the
thermal and electrical models of the cathode described here should be solved self-consistently.
Once all these effects are taken into account, we believe the quantitative picture of cold cathode
operation presented here will remain essentially correct, predicting a dynamical shift of the
work function of the thin semimetallic film of the same order of magnitude as the one
reported here.

Our analysis provides the basic design rules to fabricate a new cold cathode with
emission windows with a rectangular or circular geometry. Fabrication
would require the epitaxial growth of the structure shown in Fig. 1. As discussed above, the
epitaxial growth of InP/CdS heterostructures has been reported in the literature in the
past [9]. The deposition of epitaxial LaS thin films has not been reported, to the best of
our knowledge. We believe, however, that the figures of merit of the various cold cathodes
analyzed in this work are a strong incentive towards the experimental investigation of these
devices. If successful, such an experimental effort would lead to big pay-offs with the design
of highly efficient cold cathodes for large panel displays, IR image convertors and sensors, and
active power devices in mobile and airborne electronic equipment for military, commercial,
and private use.
Table I: Material Parameters of the Cold Cathode

Material                                  Au (Ag)        n++-InP     i-CdS      LaS
Lattice Thickness (Å)                     optional       optional    300.0      24.6
Lattice Constant (Å)                      4.04 (4.09)    5.86        5.83       5.85
Workfunction (eV)                         4.3 (4.3)      4.4         4.2        1.14
Bandgap (eV)                                             1.42        2.5
# of free electrons (10^22 cm^-3)         5.9 (5.86)     n++         -          1.99
Electron Mass (m0)                        1.0            0.0765      0.14       1.0
Electron Mobility (cm^2 V^-1 s^-1)                       5370.       400.0
Thermal conductivity @ 300K (W/cm K)      3.1 (4.18)     0.74        0.05..1    0.17
Electrical resistivity (273K) (µΩ cm)     1.51 (2.04)                           92.0
Melting temperature (K)                                  1335.                  2500

Table II: Physical Parameters of the Cold Cathode

Thickness of InP substrate                    50 - 200 µm
Thickness of CdS thin film                    300 - 500 Å
Emission window length                        1 cm
Thickness of LaS thin film                    24.6 Å
Thickness of LaS thick regions                100 - 500 Å
Electron mean free path in LaS thin film      50 - 300 Å
Table III: Parameters for numerical fit (Eq.(10)) to the Fowler-Nordheim current
expression (Eq.(9)). The fit is made over a current density range from 1 to 1000 A/cm².

Width of Rectangular Window (µm)     J0 (A/cm²)       α (V^-1)
10,000                               7.2186X10-^      2.3651
1,000                                3.6483X10-*      2.1302
500                                  3.8505X10-*      2.1224
200                                  4.7337X10-®      2.0933
100                                  4.7902X10-®      2.0916
50                                   4.8207X10-^      2.0907
20                                   4.8424X10-«      2.0901
10                                   4.8424X10-®      2.0901
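
The text does not say how the fit of Eq.(10) was obtained. One simple possibility, shown here as a Python sketch on synthetic data, is a least-squares line through ln J versus V, since an expression of the type $J = J_0 e^{\alpha V}$ is linear in that representation. The function name and the sample values of $J_0$ and $\alpha$ are placeholders chosen for illustration, not entries from Table III.

```python
import math

def fit_exponential(v, j):
    """Least-squares fit of ln j = ln J0 + alpha * v; returns (J0, alpha)."""
    n = len(v)
    y = [math.log(x) for x in j]
    v_mean = sum(v) / n
    y_mean = sum(y) / n
    sxx = sum((vi - v_mean) ** 2 for vi in v)
    sxy = sum((vi - v_mean) * (yi - y_mean) for vi, yi in zip(v, y))
    alpha = sxy / sxx
    j0 = math.exp(y_mean - alpha * v_mean)
    return j0, alpha

if __name__ == "__main__":
    # Synthetic J(V) data spanning roughly 1 to 1000 A/cm^2.
    j0_true, alpha_true = 1.0e-6, 2.1          # placeholder values
    bias = [6.6 + 0.2 * k for k in range(17)]  # 6.6 V to 9.8 V
    current = [j0_true * math.exp(alpha_true * v) for v in bias]
    print(fit_exponential(bias, current))
```

Fitting in log space weights each decade of current equally, which suits a fit meant to hold from 1 to 1000 A/cm².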
Figures

Figure 1: (a) Left: cross-section of the newly proposed cold cathode between two emitter fingers. Trapping
of electrons by the LaS semimetallic thin film leads to a lateral current flow and current crowding in the
structure. Right: illustration of the partial reflection of the two-dimensional electron gas in the LaS thin
film upon entering the three-dimensional contact regions where the external bias is applied to Au contacts
made to the thick LaS regions. (b) Illustration of the multiple finger metallic structure used to bias
appropriately a cold cathode with rectangular emission window.
Figure 1 (cont'd): (c) Array of circular cold cathode emitters arranged in a honeycomb configuration and
close-up view of one of the emitters showing the various layers in the epitaxially grown structure and the
contacts made to the LaS metallic grid and the n++-InP heavily doped injection layer. [After W. Friz,
Task ELM-2, "Dissipative Processes in Veiled Work Function Emitters", Contract F 33615-C-95-1755,
WPAFB, Dayton, May 1996].
Figure 2: Schematic representation of the conduction band profile throughout the cold cathode emitter
described in the text. Under forward bias, a fraction of the emitted current is captured in the semimetallic
slab. The subsequent excess sheet carrier concentration in the quantum well formed by the semimetallic
slab leads to a shift of the Fermi level in the thin film which is similar to a lowering of the work function of
the thin film. For a given forward bias, this leads to an increase in the electric field in the wide bandgap
semiconductor (dashed line versus full line) and to an increase in the injection and emitted currents. Also
shown is the quasi-Fermi level spatial dependence across the wide bandgap semiconductor.
Figure 3: Dynamical shift of the work function as a function of the external applied bias for a cold
cathode emitter with the parameters listed in Table I. For each group of curves, r = 0.99, 0.9, and 0.75,
from left to right. The following physical parameters were used: $\lambda_{LaS}$ = 300 Å, $T_1 = T_2 = 0.5$, and
vf  =  l-SOvYlO^cm/s. 


Figure 4: Comparison between the emitted current as a function of the applied bias while including (dashed
line) and neglecting (full line) the dynamical shift of the work function of the semimetallic slab described
in the text. The parameters of the device are listed in Table I. The coefficients $C_1$ and $C_2$ in Eq.(1) were
chosen  equal  to  1.5X10®  A/V  and  6.9X10’^  (V^/^cm)~\  respectively.  Other  choices  for  the  parameters 
$C_1$ and $C_2$ with similar magnitudes lead to a similar dynamical shift of the work function. The reflection
coefficient r shown in Fig. 1(a) was set equal to 0.99. The following parameters were used: $\lambda_{LaS}$ = 300 Å,
Ti  =  T2  =  0.5,  and  vf  =  l.SoiriO^cm/s. 
Figure 5: (a) Bias dependence of the dynamic work function shift in a typical cold cathode while varying
the reflection coefficient of electrons between the 2D thin film and LaS contact regions. (b) Corresponding
bias dependence of emitted current density. The following parameters were used ($L_1$ = 500 Å, $L_2$ = 24.6
Å, W = 1 cm, $T_1 = T_2 = 0.5$, $\lambda_{LaS}$ = 300 Å, $\Delta(Ag)$ = 0.56 eV).
Figure  6:  Same  as  Figure  5  for  a  cathode  with  a  300  A  thick  CdS  region,  all  other  parameters  being  kept 
the  same.  Comparison  with  Figure  5  shows  that  the  exponential  rise  of  the  dynamic  work  function  shift 
and  the  emitted  current  density  occurs  at  smaller  values  of  the  applied  bias. 
Figure 7: (a) Bias dependence of the dynamic work function shift in a typical cold cathode while varying the
transmission coefficients $T_1$ and $T_2$ at the CdS/LaS and LaS/vacuum interfaces. (b) Corresponding bias
dependence of emitted current density. The full and dashed lines are the current density plots calculated
with and without the effect of the dynamic work function shift. The following parameters were used ($L_1$
= 300 Å, $L_2$ = 24.6 Å, W = 1 cm, $\lambda_{LaS}$ = 300 Å, and $\Delta(Ag)$ = 0.56 eV). The curves are labelled with the
values of the transmission probabilities $T_1$ and $T_2$, which were assumed to be identical.



Figure 8: (a) Bias dependence of the dynamic work function shift in a typical cold cathode while varying
the electron mean free path $\lambda_{LaS}$. (b) Corresponding bias dependence of emitted current density. The following
parameters were used ($L_1$ = 300 Å, $L_2$ = 24.6 Å, W = 1 cm, $\lambda_{LaS}$ = 300 Å, $\Delta(Ag)$ = 0.56 eV, $T_1 = T_2$ =
0.5, and r = 0.99).
Figure 9: (a) Fowler-Nordheim fit to the bias dependence of the emission current of a cold cathode with
the physical parameters listed in the caption of Figure 5. (b) Fit of the same $J_{FN}$ versus $V_{bias}$ calculation
by an expression of the type $J_{FN} = J_0 e^{\alpha V_{bias}}$ for a range of $V_{bias}$ for which the emitted current density
varies between 1 and 1000 A/cm².
Figure 10: Fowler-Nordheim fit to the bias dependence of the dynamic work function shift for a cold
cathode with the physical parameters listed in the caption of Figure 5. The following parameters were
used ($L_1$ = 300 Å, $L_2$ = 24.6 Å, W = 1 cm, $\lambda_{LaS}$ = 300 Å, and $\Delta(Ag)$ = 0.56 eV).
Figure 11: Bias dependence of the parameter c as a function of applied bias for different radii of a cold
cathode emitter with a circular geometry. Cold cathodes with narrower emitter window must be operated
over a smaller dc bias range to avoid current crowding. The following parameters were used ($L_1$ = 300 Å,
$L_2$ = 24.6 Å, W = 1 cm, $\lambda_{LaS}$ = 300 Å, $\Delta(Ag)$ = 0.56 eV, and r = 0.99). Current crowding effects are
negligible as long as the parameter c is kept under 0.05.
Figure 12: Radial variation of the potential drop across the LaS windows of various radii. The cathode is
assumed to operate at room temperature and the electrical resistivity of LaS is set equal to 92 µΩ cm. The
potential is measured relative to the potential at the rim of the LaS circular window, where the electrostatic
potential is assumed to be equal to the applied bias. For each figure, the values of the normalizing current
density are, from top to bottom, 1, 10, 100, 1000 A/cm². The LaS window radius is indicated on top of
each figure. The following parameters were used ($L_1$ = 300 Å, $L_2$ = 24.6 Å, W = 1 cm, $\lambda_{LaS}$ = 300 Å,
and $\Delta(Ag)$ = 0.56 eV). Right frames: corresponding radial dependence of the emitted current densities.
Current densities are normalized to the values of the emitted current at the window rim.
Figure 13: Variation with the parameter c of the different contributions to the total power dissipated in
circular cold cathodes with different radii (see Eqns.(32)-(36)). The physical parameters of the cathodes are
listed in Table I and in the caption of Fig. 5. The CdS layer is assumed to be grown on a heavily doped InP
substrate 100 µm thick. The thermal conductivity of the InP substrate was taken equal to 0.74 W/K cm
and the parameter b in Eq.( ) was set equal to 1.45. The thickness of the LaS contact regions was set
equal to 100 Å.
Figure 14: Power efficiency $\eta_P$ of cathodes of different radii plotted as a function of (a) $V_{bias}$ and (b)
the emitted current density $J_{em}$. For all window sizes, the efficiency decreases with $V_{bias}$ and $J_{em}$ as a
result of current crowding. The efficiency is more or less constant over a wider range of $V_{bias}$ for windows
with smaller radius because current crowding is less important in that case, as illustrated in Fig. 12. The
overall lower efficiency for smaller size windows comes from the fact that the width b of the thick LaS
regions was set equal to 100 µm for all cathodes. The efficiency of cathodes can be increased by making
the ratio b/a in Fig. 1 closer to unity. The following parameters were used ($L_1$ = 300 Å, $L_2$ = 24.6 Å, W
= 1 cm, $\lambda_{LaS}$ = 300 Å, $\Delta(Ag)$ = 0.56 eV, $T_1 = T_2 = 0.5$, and r = 0.99).
Figure 15: Temperature rise in the LaS thin film as a function of the applied bias for the various cold
cathode structures studied in Figure 13. The active area of the cathode (CdS/LaS layers) is assumed to
be grown on a 100 µm thick InP substrate with the back of the substrate acting as a perfect heat sink
(300 K).
Figure 16: Emitted current densities for emitter windows of different radii as a function of the parameter
c. The vertical line at c = 0.05 is the line to the right of which the effects of current crowding cannot be
neglected. Appreciable emitter current densities can be obtained for cathodes with emitter window width
under 50 µm. The following parameters were used ($L_1$ = 300 Å, $L_2$ = 24.6 Å, W = 1 cm, $\lambda_{LaS}$ = 300 Å,
$\Delta(Ag)$ = 0.56 eV, $T_1 = T_2 = 0.5$, and r = 0.99).
CHARACTERISTICS OF THE TEXTURE
FORMED  DURING  THE  ANNEALING  OF  COPPER  PLATE 

Robert  J.  De  Angelis 
Professor 

Department  of  Mechanical  Engineering 
University  of  Nebraska-Lincoln 

Abstract 

The production of copper plate with controlled degrees of anisotropy is important because a regulated texture
provides significant assurance that subsequent plastic deformation can be performed to successfully produce a reliable
product. To reach this condition, the degree of anisotropy in a material must be quantified in the cold worked
state and at a number of intervals during the recrystallization process.

In this investigation the recrystallization kinetics of copper plate were monitored by determination of the
microhardness, the energy released and the microstructure during annealing. The texture was determined by x-ray
pole figure methods. The degree of anisotropy was inferred by calculating the orientation distribution functions
(ODF) from the pole figure data.

One  of  the  main  objectives  of  this  research  was  to  quantify  the  changes  in  texture  occurring  during  the  annealing 
of  a  cold  worked  copper  plate. 

A  finite  element  model  of  a  30  caliber  cylindrical  copper  shell  shot  at  a  rigid  wall  at  541.6  ft/sec  has  been 
developed.  The  post  mortem  radial  profiles  of  copper  specimens  fired  at  570.9  ft/s  show  excellent  correlation  with 
the  shape  of  the  projectile  predicted  by  the  finite  element  model  after  impact.  Contour  plots  of  Von-Mises  stress 
and  effective  plastic  strain  after  impact  have  also  been  created.  These  results  indicate  finite  element  modeling  is  an 
effective  way  to  predict  final  deformed  shape  of  these  types  of  ballistic  impacts. 
Introduction: 

Annealing phenomena have both a scientific and engineering importance because these phenomena play
important roles in the formation of the microstructure and in determining the engineering properties of the processed
metallic materials. Annealing or softening of cold worked or hardened metals has been described by two
mechanisms: recovery and recrystallization. These may occur separately or sequentially depending on the initial grain
size, the degree of cold work, and the annealing thermal cycle. Recovery is associated with the annihilation of
crystalline defects and the migration of dislocations into arrays which form small angle boundaries. Recrystallization
takes place by the nucleation and growth of new strain free grains at the expense of the deformed matrix.

The  recrystallization  of  copper  was  investigated  in  the  late  1940’s  (1,2)  through  the  observation  of  changes  in 
mechanical  properties  and  the  integrated  intensity  of  the  (200)  Bragg  diffraction  peak  during  isothermal  annealing. 
In addition to the usual variables of amount of cold work and annealing times and temperatures, Cook and Richards
(1)  investigated  the  role  of  the  initial  grain  size  on  the  recrystallization  behavior  of  copper.  They  reported  that 
copper  (containing  0.04  oxygen,  0.003  silver  and  0.001%  iron)  with  grain  sizes  of  0.015  mm,  0.025  mm  and  0.06 
mm  cold  rolled  to  97.5%  reduction  demonstrated  times  for  50%  recrystallization  at  125°C  of  0.3  hrs.,  0.7  hrs.  and 
greater  than  8000  hrs  respectively.  This  result  clearly  indicates  the  necessity  to  characterize  the  initial  grain  size 
prior  to  cold  working. 

Unquestionably,  recrystallization  produces  a  change  in  the  texture  of  a  polycrystalline  material.  This  change 
from  the  cold  worked  to  the  annealed  texture  is  accompanied  by  a  corresponding  change  in  engineering  properties 
and  is  the  prime  cause  of  anisotropy  in  polycrystalline  metallic  materials.  Texture  or  preferred  orientation  imparts 
the  anisotropic  properties  of  the  single  crystal  to  the  polycrystalline  aggregate. 

Despite the general recognition that detailed texture description is needed to control macroscopic properties, very
few investigations of materials processing have included the quantification of texture. Since the 1980's the
description of material textures or crystal orientations in polycrystalline wires and sheets has started to move beyond the
crystallographic pole figure representation of texture (3). In recent years the orientation distribution function (ODF)
has become the method of choice for presenting the description of material textures (3,4). A great advantage of the
ODF method of texture representation is that the coefficients of the harmonic equations employed to describe the
function provide weighting factors for the determination of the anisotropic elastic and plastic properties of the
material.

Also included in this report are the results of a dynamic nonlinear finite element analysis of the firing of a 30
caliber cylindrical copper shell at a rigid wall. The finite element code utilized for the analysis was LS-Dyna3d, as
it provides excellent modeling of nonlinear material behavior. The objective of this analysis is to compare the shape following
impact predicted from the finite element method with experimentally observed shapes of projectiles that impacted
a stationary wall.

Materials  Processing: 

Nebraska Plate: A 0.375 inch thick copper plate was produced from a one inch thick pancake by cold rolling,
employing a clockwise rotation of 135° between passes. The copper pancake was made by upset forging a three inch
diameter, three inch long bar cut from a hot rolled three inch thick slab. One half of the as-cold-rolled plate was
provided by Mr. Joel W. House of Wright Laboratory (AWEF) at Eglin Air Force Base. This one-half section of
material, shown in Fig. 1, will be referred to as the Nebraska plate.

ATK Plate: A second plate, provided by Mr. Joel House, was processed by Allianttech Systems and was from Lot
A2365. The alloy is C10100 copper supplied by the mill as 3.5 inch diameter bar. The bar was annealed at 343°C
(650°F) for one hour. Billets of 2.6 inch length were cold upset forged to 0.39 to 0.35 inch thick plate (strains of
-1.9 to -2.0) in five hammer blows. After annealing at 343°C (650°F) for one hour the copper plate had a grain size
of 10 to 15 microns with a few isolated 20 micron grains. An approximately six inch diameter, one inch high dome
was cold formed into the center of the plate, leaving a one and a half inch diameter rim around the dome. This plate
will be referred to as the ATK plate. Testing on this plate was performed at Eglin AFB AWEF facilities during the
summer of 1996.
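
The strains quoted for the upset forging can be checked against the stated billet and plate dimensions. The short sketch below assumes that the quoted values are logarithmic (true) height strains, which the report does not state explicitly; the function name is illustrative only.

```python
import math


def true_strain(initial_height: float, final_height: float) -> float:
    """Logarithmic (true) strain for a height change; negative in compression."""
    return math.log(final_height / initial_height)


# ATK plate: 2.6 inch billets cold upset to between 0.39 and 0.35 inch thickness.
atk_strain_thick = true_strain(2.6, 0.39)
atk_strain_thin = true_strain(2.6, 0.35)

# Nebraska plate: one inch pancake cold rolled to 0.375 inch thickness.
nebraska_rolling_strain = true_strain(1.0, 0.375)

print(f"ATK upset strain: {atk_strain_thick:.2f} to {atk_strain_thin:.2f}")
print(f"Nebraska rolling strain: {nebraska_rolling_strain:.2f}")
```

Under this assumption the ATK values fall close to the -1.9 to -2.0 range given in the text.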

Experimental  Procedures: 

The experimental and analytical techniques employed in this investigation can be partitioned into six task areas:
materials processing; x-ray pole figure determinations; thermal analysis; microstructural characterization;
mechanical property determination; and finite element modeling of the impact behavior of copper. These tasks were
executed at the University of Nebraska-Lincoln (UN-L) and at Wright Laboratories, Eglin Air Force Base, AWEF
(WL/MNMW). This division of the experimental efforts took advantage of the expertise and equipment
existing at both locations. The hardness measurements, the thermal analysis experiments, the metallographic
observations and the finite element modeling of the impact behavior of copper were performed at UN-L. The
x-ray pole figure determinations, the mechanical property determinations and the 30 caliber gun impact tests were
performed at WL/MNMW.

Nebraska  Plate:  Metallographic,  x-ray,  and  mechanical  test  specimens  were  machined  from  the  Nebraska  plate  at 
1/4,  3/4  and  4/4  radial  positions.  These  specimens  were  located  in  the  plate  half  section  such  that  their  radial  center 
lines  coincided  with  zero,  forty  five  and  ninety  degrees  to  the  cut  surface.  These  nine  specimens,  plus  the  specimen 
from  the  center,  were  the  ten  locations  in  the  plate  where  structural  determinations  were  made.  The  specimen  layout 
in  the  Nebraska  plate  is  shown  in  Fig.  2. 

ATK Plate: The rim of the ATK plate was essentially unaffected by the cold deformation during the formation of
the dome. The rim material provided specimens for metallography, x-ray diffraction, quasi-static and
high strain rate mechanical testing, and Taylor impact tests from the positions shown in Fig. 3.

Heat  Treatment: 

Nebraska  Plate:  Samples  from  several  positions  in  the  Nebraska  plate  were  heat  treated  in  flowing  argon  at 
temperatures  between  260°C  and  300°C  for  various  times.  These  samples  were  employed  for  optical  microstructure 
investigation.  The  specimens  employed  for  pole  figure  determinations  were  wrapped  in  tantalum  foil  and  vacuum 
annealed  at  300°C  for  one  hour. 

Microstructural  Characterization: 

Nebraska Plate: Optical and scanning electron microscopy (SEM) were employed to characterize the microstructure
of the copper plate prior to cold rolling, in the cold rolled condition and at each of the annealing intervals at each
temperature. The Kohlhoff etching technique, which reveals (111) planes and shear bands, permits the qualitative
identification of grain orientation relationships by SEM examination (5).

X-Ray  Pole  Figures: 

Nebraska Plate: X-ray pole figures were determined using the Siemens diffractometer at WL/MNMW. The
specimens were prepared and the data were collected by the author and Mr. Todd Snyder, a UN-L graduate research
assistant. Pole figures were determined at the ten locations on the Nebraska plate shown in Fig. 4. These ten
samples were split near the mid-plane and milled flat. The mid-plane surface was prepared for x-ray investigation by
metallographic polishing and etching. The (111), (200) and (220) pole figures were collected from the mid-plane
surface of the ten specimens. The pole figure data were transformed to ODFs employing both POPLA and Siemens
software. The specimens were then vacuum annealed at 300°C for one hour. The identical surfaces were prepared as
described above and the (111), (200) and (220) pole figures were determined on the ten annealed specimens. The
ODFs of the annealed specimens were calculated from the pole figure data.

ATK Plate: Pole figures of the (111), (200) and (220) planes were determined for the ATK plate. These data were
employed to calculate the ODF of this plate and the inverse pole figures.

Thermal  Analysis: 

Nebraska Plate: The rate and amount of energy release were investigated using differential thermal analysis (DTA)
and differential scanning calorimetry (DSC). Isothermal and scanning modes were utilized to monitor the fraction
of stored energy released. This information was used to provide insights into the kinetics of the recrystallization
process.

Finite  Element  Model: 

The solid finite element model of the 30 caliber copper projectile was created using Altair's Hypermesh mesh
generator version 2.0e. The projectile was 0.9 inches long with a diameter of 0.289 inches. The average element
length was 0.02 inches. A rigid wall, fixed in space, was located 0.01 inches from the end of the projectile.
The friction between the wall and the projectile was assumed to be zero in the simulation. The
full model consisted of 6360 elements and 8150 nodes. The initial geometry and setup of the simulation are shown
in Figs. 5 and 6. The projectile was given an initial velocity of 541.6 in/sec in a direction toward the wall. The
simulation ran for 10 milliseconds of impact time, which took 21.661 minutes of CPU time on a Cray J916 computer
using fully integrated solid elements.

The material behavior of the copper projectile was described using a piecewise linear plasticity constitutive
model. This model permits the input of the mass density of copper and the values of its Poisson's ratio, elastic
modulus, and yield strength. Table I contains the bulk material property values that were employed in the simulation.
The strain rate sensitivity of the material was introduced into the model by the creation of a family of stress versus
effective plastic strain curves for various strain rates. Here, two experimentally determined stress-strain curves for
copper were employed, at strain rates of 0.0222 per second and 1777.58 per second. Values between these two
strain rates were linearly interpolated. The experimentally determined relationships between stress and effective
plastic strain for the copper material at the two strain rates are shown in Fig. 7.
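
The interpolation between the two rate-dependent curves can be illustrated with a short sketch. It is not the LS-Dyna3d implementation: the curve points below are hypothetical placeholders (the measured curves of Fig. 7 are not tabulated in this report), and the function names are illustrative. Only the two strain rates are taken from the text.

```python
from bisect import bisect_right

RATE_LOW = 0.0222     # per second, from the text
RATE_HIGH = 1777.58   # per second, from the text

# Hypothetical (effective plastic strain, stress in psi) points, NOT the Fig. 7 data.
CURVE_LOW = [(0.0, 10_000.0), (0.1, 30_000.0), (0.5, 45_000.0)]
CURVE_HIGH = [(0.0, 15_000.0), (0.1, 40_000.0), (0.5, 60_000.0)]


def stress_on_curve(curve, strain):
    """Piecewise linear stress at a given plastic strain, clamped at the curve ends."""
    strains = [point[0] for point in curve]
    if strain <= strains[0]:
        return curve[0][1]
    if strain >= strains[-1]:
        return curve[-1][1]
    i = bisect_right(strains, strain)
    (e0, s0), (e1, s1) = curve[i - 1], curve[i]
    return s0 + (s1 - s0) * (strain - e0) / (e1 - e0)


def flow_stress(strain, strain_rate):
    """Linear interpolation in strain rate between the two bounding curves."""
    rate = min(max(strain_rate, RATE_LOW), RATE_HIGH)
    weight = (rate - RATE_LOW) / (RATE_HIGH - RATE_LOW)
    low = stress_on_curve(CURVE_LOW, strain)
    high = stress_on_curve(CURVE_HIGH, strain)
    return low + weight * (high - low)


print(flow_stress(0.1, 500.0))
```

Rates outside the measured range are clamped to the nearest curve in this sketch; whether the finite element code extrapolates or clamps is not stated in the report.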


Table I - Bulk Material Properties of Copper.

Material   Density         Poisson's Ratio   Elastic Modulus   Yield Strength
           (lb·s²)/in⁴                       (psi)             (psi)

Copper     8.336E-04       0.30              16.0E+06          1.00E+04
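
As a consistency check on the density entry in Table I, the mass density in inch-pound-second units can be converted to more familiar units. The sketch below assumes the density unit is lb·s²/in⁴ (the unit string is damaged in the source) and uses standard gravity for the conversion; the variable names are illustrative.

```python
STANDARD_GRAVITY_IN_PER_S2 = 386.0886   # 9.80665 m/s^2 expressed in in/s^2
KG_PER_M3_PER_LB_PER_IN3 = 27679.9      # 1 lb/in^3 expressed in kg/m^3

density_model = 8.336e-4  # lb*s^2/in^4, from Table I (unit assumed)

# Multiplying the mass density by g gives the weight density in lb/in^3.
weight_density_lb_per_in3 = density_model * STANDARD_GRAVITY_IN_PER_S2
density_kg_per_m3 = weight_density_lb_per_in3 * KG_PER_M3_PER_LB_PER_IN3

print(f"{weight_density_lb_per_in3:.4f} lb/in^3")
print(f"{density_kg_per_m3:.0f} kg/m^3")
```

The result is close to commonly quoted handbook values for copper, which supports the assumed unit.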

Results  and  Discussion: 

Heat  Treatment: 

Optical metallography of the isothermally annealed as-rolled copper (CU-HR7-AR) specimens indicated
approximately 90% recrystallization after annealing at 260°C for 20 hr. Increasing the temperature 10°C to 270°C
and annealing for 20 hr resulted in full recrystallization. These observations are confirmed by microhardness
measurements, which indicate a softening transition in the temperature range between 230°C and 270°C.

Microstructural  Characterization: 

Nebraska Plate: As-rolled and heat-treated specimens were examined at representative positions along the three
sampling directions. The as-rolled microstructure was typical of heavily cold-worked material, with no obvious
distinction among samples taken from the three plate directions. For local texture measurement, a crystallographic
etching technique (5) (concentrated nitric acid attacks {111} planes most slowly) indicated differences in the
directions parallel and perpendicular to the rolling direction.

The microstructures of the heat treated samples were partially recrystallized at all annealing temperatures between
260°C and 300°C. Large (50 μm) elongated grains were observed after annealing at all temperatures. The
microstructures of specimens heat treated under various conditions were investigated to aid understanding of the
recrystallization process and its kinetic behavior and to help evaluate the texture development. An optical micrograph
of a typical recrystallized microstructure is shown in Fig. 8. Table II contains a summary description of the
microstructural observations of specimens annealed to various conditions.


Table II - Summary of Microstructural Observations.

Temperature (°C) /   Avg. Recrystallized   Approximate Percent   Comments
Time (h)             Grain Size (μm)       Recrystallized

260 / 1              3                     <10                   sparsely located grains
260 / 20             12                    58                    large regions of unrecrystallized metal
270 / 1              8                     15                    unrecrystallized region elongated in radial direction
270 / 20             14                    78                    -
280 / 1              12                    22                    -
280 / 20             18                    83                    large unrecrystallized regions remain
290 / 1              22                    90                    -
290 / 20             -                     -                     -
300 / 1              30                    100                   -
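
One conventional way to examine recrystallized fractions such as those in Table II is the Johnson-Mehl-Avrami-Kolmogorov (JMAK) relation X = 1 - exp(-k t^n). The report does not state that this analysis was performed; the sketch below is only an illustration, under the assumption that JMAK kinetics apply, of how an apparent exponent n could be estimated from the 1 hour and 20 hour fractions at one temperature. The function name is illustrative.

```python
import math


def avrami_exponent(t1: float, x1: float, t2: float, x2: float) -> float:
    """Apparent Avrami exponent n from two (time, fraction) points.

    Uses the linearized JMAK form ln(-ln(1 - X)) = ln k + n ln t.
    """
    y1 = math.log(-math.log(1.0 - x1))
    y2 = math.log(-math.log(1.0 - x2))
    return (y2 - y1) / (math.log(t2) - math.log(t1))


# Fractions recrystallized taken from Table II (1 h and 20 h anneals).
n_270 = avrami_exponent(1.0, 0.15, 20.0, 0.78)
n_280 = avrami_exponent(1.0, 0.22, 20.0, 0.83)

print(f"apparent n at 270 C: {n_270:.2f}")
print(f"apparent n at 280 C: {n_280:.2f}")
```

With only two points per temperature and visually estimated fractions, such exponents are rough indications at best.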

SEM micrographs of an as-rolled specimen and a specimen vacuum annealed for one hour at 300°C, both
etched in concentrated nitric acid, are shown in Fig. 9. Etching the radial direction of the as-rolled specimen revealed
an organization of lamellar volumes of similar orientation. In addition, deformation banding was observed to be
extensive, as shown in Fig. 10. These deformation bands produce regions of high lattice curvature, which in turn
strongly influence the nucleation and growth of recrystallized grains.

X-Ray  Pole  Figures: 

Nebraska  Plate:  The  pole  figures  from  all  positions  except  for  the  center  position  of  the  Nebraska  plate  showed  the 
cold  worked  textures  to  be  mainly  a  combination  of  (200)  and  (220)  wire  textures.  This  texture  combination  reduced 
the  (111)  component  at  the  center  of  the  cold  worked  plate  to  almost  zero.  The  specimen  taken  from  the  center 
position  had  a  much  weaker  (220)  wire  texture  in  the  cold  worked  condition,  and  a  generally  weaker  texture  in  the 
annealed  condition.  The  pole  figures  on  the  annealed  center  specimen  were  statistically  less  reliable  due  to  the  large 
grain  size  that  developed  during  annealing.  Presumably  this  was  because  the  material  in  the  center  of  the  plate  was 
subjected  to  much  less  deformation  during  the  upset  forging  and  subsequent  rolling  procedure. 

Differences between predominantly wire textures that occurred during annealing can be demonstrated by plotting
the integrated intensity of the annealed condition at any chi tilt angle minus the integrated intensity of the cold
worked condition at the same chi angle. This difference in pole figures is shown for the specimen taken from the
center of the plate in Fig. 11.

The changes in crystallographic orientation during annealing were averaged for the 0°, 45°, and 90° specimens
at each radial position of 1/4R, 3/4R and 4/4R. The average changes in the (111), (200), and (220) pole densities
at the three radial positions as a function of chi tilt angle are shown in Figs. 12, 13 and 14. It is apparent from these
results that the as-deformed microstructures produced by the rolling procedure, though very distinct at various
positions in the plate, formed similar recrystallization textures under the annealing protocol.
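
The difference curves and their averages over the three specimens at a radial position amount to simple element-wise arithmetic, sketched below. The chi angles and intensity values are hypothetical placeholders, not data from this study, and the function names are illustrative.

```python
def intensity_difference(annealed, cold_worked):
    """Annealed minus cold worked integrated intensity at each chi tilt angle."""
    if len(annealed) != len(cold_worked):
        raise ValueError("curves must be sampled at the same chi angles")
    return [a - c for a, c in zip(annealed, cold_worked)]


def average_curves(curves):
    """Point-by-point average of several difference curves."""
    count = len(curves)
    return [sum(values) / count for values in zip(*curves)]


# Hypothetical integrated intensities at chi = 0, 30, 60 degrees for the
# 0, 45 and 90 degree specimens at one radial position: (annealed, cold worked).
specimens = [
    ([1.2, 0.9, 1.5], [0.4, 1.1, 2.0]),
    ([1.0, 1.0, 1.4], [0.5, 1.2, 1.8]),
    ([1.1, 0.8, 1.6], [0.3, 1.0, 2.2]),
]

differences = [intensity_difference(a, c) for a, c in specimens]
mean_difference = average_curves(differences)
print(mean_difference)
```

A positive value marks a chi angle at which the pole density increased during annealing, a negative value one at which it decreased.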

The pole figures of the cold worked materials showed significant variations in texture as the radial position
increased. The as-deformed textures had (220) pole density maxima ranging from 2.67 to 15.88. Vacuum
annealing the specimens for one hour at 300°C resulted in very similar textures for all the specimens. These results
were unexpected and raise the important possibility that the annealing thermal treatment determines the
final recrystallization texture of the cold worked copper plate.

ATK Plate: The x-ray specimen taken from the plate position shown in Fig. 3 was prepared for x-ray diffraction
by polishing and etching as described in the previous section for the Nebraska plate. The resulting pole figures
indicated the texture to be a pure (220) wire texture with a maximum strength of 2.53 times random. The (111) pole
figure had a strength of 1.73 times random distributed uniformly on a circle centered 35° from the center of the pole figure.
An attribute of the texture to be emphasized is the remarkable uniformity of the wire texture distribution. This
uniformity is due to the processing history, the cold upsetting of a rolled bar into plate.

Thermal  Analysis: 

Nebraska  Plate:  Attempts  made  to  determine  the  amounts  and  rates  of  energy  released  during  annealing  of  the 
plate  processed  copper  specimens  were  not  completely  successful.  These  measurements  were  to  allow,  in  addition 
to  other  materials  parameters,  the  determination  of  the  activation  energy  for  recrystallization.  The  rates  and  amounts 
of  energy  released  were  investigated  by  employing  a  differential  thermal  analysis  system  (Perkin-Elmer  DTA-7)  and 
differential  scanning  calorimetry  unit  (Perkin-Elmer  Delta  Series  DSC-7).  Recrystallization  information  obtained 
from  the  annealing,  microstructure  and  hardness  investigation  was  employed  to  select  the  temperature  and  time 
parameters  for  the  isothermal  and  scanning  experimental  modes  of  thermal  analysis. 

Initial  isothermal  investigations  were  performed  on  the  DTA-7  which  measures  a  differential  temperature 
between  a  reference  cup  and  a  sample  cup.  Isothermal  temperatures  between  260°C  and  290°C  were  investigated 
with  time  periods  of  30  hours.  No  indication  of  stored-energy  release  was  observed  in  these  experiments.  However, 
post  microstructural  observations  indicated  recrystallization  had  taken  place.  Similar  results  were  observed  in  the 
scanning mode thermal studies conducted at scan rates of 5, 10, and 20°C/min. To increase the sensitivity of the
measurements,  a  differential  scanning  calorimeter  (DSC)  was  used  for  scanning  mode  studies  with  a  temperature  span 
starting  at  room  temperature  and  ending  at  700°C.  These  experiments  produced  incomplete  energy  release  data.  The 
energy release would go through a maximum, but the rate of energy release would not return to the baseline value.
These results were of little quantitative or qualitative value. A significant amount of effort was invested into the
thermal analysis experimentation; however, no definite conclusions can be extracted from the data.

Hardness  Measurements: 

Nebraska Plate: Rockwell B hardness traverses were taken along OA, OB, OC, and OD. The individual hardness
determinations, made one half inch apart along the traverses, are plotted in Fig. 15. There is some scatter in the data;
however, it is apparent that the plate is slightly harder (RHB 62) at the edge and slightly softer (RHB 55) in the
center of the plate. Table III shows the statistics for each traverse.


Table III - Average Rockwell B Hardness of Traverses A, B, C, and D.

Traverse   Mean Hardness   Std. Dev.

A          56.4            2.6
B          55.7            4.4
C          55.8            3.3
D          57.4            3.0

Microhardness measurements, made on a Wilson Tukon Series 200 machine, of specimens annealed for one hour at
temperatures between 230°C and 330°C indicated that significant softening occurs at temperatures above 230°C.
Annealing at 260°C for 1 hour produced the largest decrease in hardness. Annealing at higher temperatures
produced material with hardnesses slightly higher than the values observed after annealing at 260°C.

ATK Plate: Rockwell F hardness data were taken on the “as received” ATK plate at the positions shown in Fig.
16. The hardness values measured in the rim of the plate were about Rf 55, the same as the hardness of the annealed
Nebraska plate contained in Table III. The hardness increases rapidly at the start of the dome, where the bending is
extensive. In the dome section of the ATK plate the increase in hardness is gradual, from Rf 56 at the base of the dome
to Rf 83 at the apex of the dome. These results allow the specimens from the rim section to be considered annealed.

Mechanical  Properties: 

The mechanical properties of the plate materials were determined at low strain rate (Instron machine),
at high strain rate (Hopkinson bar) and in 30 caliber ballistic tests (Taylor impact). All three types of tests were
performed on the ATK plate; however, only the Taylor impact tests were performed on the Nebraska plate. The results
of the other two types of mechanical tests reported here were obtained from a copper plate, referred to as the MSC
plate, processed on exactly the same schedule as the Nebraska plate.

Nebraska Plate: Three Taylor impact test specimens were machined from the Nebraska plate at 0°, 45°
and 90° to the minor axis of the plate (see Fig. 2). These specimens were tested at Eglin AFB, AWEF, resulting in
the data shown in Table IV.

Table IV - Data from Taylor Impact Specimens.

SHOT #   MATERIAL   L/D     IMPACT VELOCITY   FINAL LENGTH   L-NOT DEFORM   ECCENTRICITY
                    RATIO   (ft/s)            (in)           (in)

SC-5     MSC-16     3.0     738               0.536          0.000          0.314
SC-6     MSC-16     3.0     656               0.591          0.000          0.283
SC-78    MSC-16     3.0     561               0.638          0.000          0.267
SC-79    MSC-16     3.0     571               0.657          0.000          0.293
SC-82    ATK-T-90   3.0     600               0.635          0.000          0.201
SC-83    ATK-T-45   3.0     571               0.646          0.000          0.206
SC-84    NEB-T-0    3.0     581               0.626          0.000          0.130
SC-85    NEB-T-90   3.0     571               0.635          0.000          0.191
SC-86    NEB-T-45   3.0     581               0.629          0.000          0.289
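
Taylor impact data of this kind are often reduced to a rough dynamic flow stress. The sketch below uses the simplest form of the Wilkins-Guinan estimate, Lf/L0 = exp(-ρv²/2σ), which is not an analysis reported in this study. It assumes an initial specimen length of 0.9 inches (the length given for the finite element model; the test specimen length is not stated) and the Table I density; the function name is illustrative.

```python
import math

DENSITY = 8.336e-4          # lb*s^2/in^4, from Table I
INITIAL_LENGTH_IN = 0.9     # assumed: taken from the finite element model description


def wilkins_guinan_flow_stress(velocity_ft_s, final_length_in,
                               initial_length_in=INITIAL_LENGTH_IN,
                               density=DENSITY):
    """Rough dynamic flow stress (psi) from Lf/L0 = exp(-rho v^2 / (2 sigma))."""
    velocity_in_s = velocity_ft_s * 12.0
    return density * velocity_in_s ** 2 / (
        2.0 * math.log(initial_length_in / final_length_in))


# (impact velocity in ft/s, final length in inches) from Table IV.
shots = {"SC-79": (571.0, 0.657), "SC-83": (571.0, 0.646), "SC-85": (571.0, 0.635)}

for name, (velocity, final_length) in shots.items():
    stress = wilkins_guinan_flow_stress(velocity, final_length)
    print(f"{name}: approx. {stress / 1000.0:.0f} ksi")
```

The estimate is sensitive to the assumed initial length, so the values are suitable only for comparing shots with one another.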

In addition to the data included in the table above, Mr. Joel House, Eglin Air Force Base, AWEF, measured the
radii of the specimens, after impact, as a function of position along their length. Data from two of the specimens,
SC-79 and SC-82, were employed to determine the validity of the finite element model.

ATK Plate: The specimen locations for the Instron, Hopkinson bar and Taylor mechanical tests are shown in
Fig. 3. All sixteen of the 0° and 90° static and high strain rate specimens were tested at AWEF. The data obtained
from the Instron and Hopkinson bar tests are shown in Figs. 17a to 17f and 18a to 18h.

Finite Element Model:

Analysis of the data obtained from the finite element model of a copper cylindrical projectile impacting a rigid
wall at 541.6 ft/sec shows good correlation with actual testing data obtained for shots fired at 570.9 ft/sec. Two
views of the geometry of the copper projectile after impact are shown in Figs. 19 and 20. A comparison of the profiles
of the radii along the length of the projectile after impact for two experimental tests and for the computer simulation
is shown in Fig. 21. The two experimental tests, designated SC79 and SC83, used copper specimens machined
from MSC-Plate 16 and ATK-T-90 respectively. Both specimens had an anvil striking velocity of 570.9 ft/sec. While
the striking velocities for the finite element model and the actual tests are somewhat different, they are
close enough to draw qualitative comparisons. The data contained in Fig. 21 indicate that the computer simulation
profile compares very favorably with the data from the specimens undergoing the actual impact. The Von-Mises
stress contour plot and an effective plastic strain contour plot through the middle of the projectile after impact are
shown in Figs. 22 and 23. These data indicate how the material flowed during plastic deformation following impact
by tracking the material regions subjected to either high or low values of stress. The
Von-Mises stress contour plot and effective plastic strain plot at the impact end of the cylinder, the end that struck
the wall, are shown in Figs. 24 and 25. These plots are also useful in understanding how the material flowed under
impact.
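
The size of the mismatch between the simulated and experimental striking velocities can be quantified with a few lines of arithmetic. The sketch below uses only the two velocities quoted in this section; the variable names are illustrative.

```python
v_model_ft_s = 541.6   # striking velocity quoted for the finite element model
v_test_ft_s = 570.9    # striking velocity quoted for the experimental shots

# Fractional shortfall of the simulated velocity relative to the tests.
velocity_shortfall = (v_test_ft_s - v_model_ft_s) / v_test_ft_s

# Kinetic energy scales with velocity squared for equal projectile mass.
kinetic_energy_ratio = (v_model_ft_s / v_test_ft_s) ** 2

print(f"velocity shortfall: {100.0 * velocity_shortfall:.1f} %")
print(f"kinetic energy ratio (model / test): {kinetic_energy_ratio:.2f}")
```

The simulated projectile therefore carries roughly one tenth less kinetic energy than the test specimens, which is consistent with restricting the comparison to a qualitative one.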

Conclusions 

An  initial  study  has  been  performed  to  understand  the  correlation  between  the  forming  processes  (die  upsetting 
and  rolling)  and  the  recrystallization  texture  of  copper  plate.  An  experimental  plan  consisting  of  materials 
processing;  microstructural  characterization;  mechanical  properties;  x-ray  pole  figure  determinations;  finite  element 
simulations  of  the  impacting  of  copper  and  thermal  analysis  was  used  to  characterize  the  copper  plate.  Annealing 
and  microstructural  studies  indicate  complete  recrystallization  after  1  hour  at  300°C.  Additional  data  accumulated 
in  thermal  analysis  experiments  concerning  the  amount  of  stored  energy  and  activation  energies  for  recrystallization 
are  inconclusive  at  this  point. 

X-ray pole figures of the as-rolled specimens showed significant variations in texture as the radial position
increased (i.e., nearer the edge of the plate). Annealing the specimens resulted in similar recrystallization textures
in every specimen. These unexpected results indicate that the annealing procedure determines the final
recrystallization texture, while the cold rolling procedure has a secondary effect.

A finite element model of a cylindrical copper projectile striking a rigid wall was constructed and tested. The
radial shape profile after impact predicted by the simulation correlated extremely well with experimental projectile
shapes. The stress and strain data resulting from the simulation are an aid in understanding the material flow during
the deformation associated with the impact event.

Figure Captions

Fig. 1  The one-half section of copper plate which is referred to as the Nebraska plate.

Fig. 2  The layout geometry of the specimens cut from the Nebraska plate.

Fig. 3  The specimen layout in the rim of the annealed Allianttech Systems domed C10100 copper plate from
Lot A2365. The rim material provided specimens for metallography, x-ray diffraction, quasi-static and high
strain rate mechanical testing, and Taylor impact tests from the positions indicated.

Fig. 4  The locations of the ten pole figure specimens in the Nebraska plate.

Fig. 5  The initial geometry of the finite element simulation. The solid finite element model of the 30 caliber
copper projectile is 0.9 inches long with a diameter of 0.289 inches. The average element size has a length
of 0.02 inches.

Fig. 6  The set up of the finite element simulation. The 30 caliber copper projectile is located 0.01 inches from
a rigid wall that is fixed in space. The friction existing between the wall and the projectile was assumed
to be zero. The full model consisted of 6360 elements and 8150 nodes.

Fig. 7  Stress versus effective plastic strain for the compression testing of MSC copper plate material at the strain
rates of 0.002 and 500 per second. The MSC plate is extremely similar to the Nebraska plate.

Fig. 8  An optical micrograph of copper showing a typical recrystallized microstructure.

Fig. 9  SEM micrographs of copper in the as-rolled condition and after vacuum annealing for one hour at 300°C.
Specimens were etched in concentrated nitric acid.

Fig. 10  An optical micrograph of copper showing deformation banding.

Fig. 11  The differences between the integrated intensity of the (111), (200), and (220) pole densities of the
annealed condition as a function of chi angle minus the integrated intensity of the cold worked condition
at the same chi angle. These differences in pole figures are shown for the specimen taken from the center
of the Nebraska plate.

Fig. 12  The average changes in the differences between the integrated intensity of the (111), (200), and (220) pole
densities of the annealed condition as a function of chi angle minus the integrated intensity of the cold
worked condition at the same chi angle. These differences in pole figures were averaged over the three
specimens taken from the one quarter radial position of the Nebraska plate.

Fig. 13  The average changes in the differences between the integrated intensity of the (111), (200), and (220) pole
densities of the annealed condition as a function of chi angle minus the integrated intensity of the cold
worked condition at the same chi angle. These differences in pole figures were averaged over the three
specimens taken from the three-quarter radial position of the Nebraska plate.

Fig. 14  The average changes in the differences between the integrated intensity of the (111), (200), and (220) pole
densities of the annealed condition as a function of chi angle minus the integrated intensity of the cold
worked condition at the same chi angle. These differences in pole figures were averaged over the three
specimens taken from the edge position of the Nebraska plate.

Fig. 15  Rockwell B hardness data taken at one half inch intervals on the Nebraska plate on the traverses indicated.

Fig. 16  Rockwell F hardness data taken on the “as received” ATK plate at the positions indicated.

Fig. 17a  Instron compression stress-strain data taken at a strain rate of 0.022 per second on longitudinal specimen
from the ATK plate at the orientation of 0°.

Fig. 17b  Instron compression stress-strain data taken at a strain rate of 0.022 per second on longitudinal specimen
from the ATK plate at the orientation of 0°.

Fig. 17c  Instron compression stress-strain data taken at a strain rate of 0.022 per second on longitudinal specimen
from the ATK plate at the orientation of 90°.

Fig. 17d  Instron compression stress-strain data taken at a strain rate of 0.022 per second on longitudinal specimen
from the ATK plate at the orientation of 90°.

Fig.  17e  Instron  compression  stress-strain  data  taken  at  a  strain  rate  of  0.022  per  second  on  radial  specimen  from 
the  ATK  plate  at  the  orientation  of  0°. 

Fig.  17f  Instron  compression  stress-strain  data  taken  at  a  strain  rate  of  0.022  per  second  on  radial  specimen  from 
the  ATK  plate  at  the  orientation  of  90°. 

Fig.  18a  Hopkinson  bar  compression  stress-strain  data  from  test  58  obtained  on  a  longitudinal  specimen  from  the 
ATK  plate  at  the  orientation  of  0°  at  the  strain  rate  of  1748/s. 

Fig.  18b  Hopkinson  bar  compression  stress-strain  data  from  test  59  obtained  on  a  longitudinal  specimen  from  the 
ATK  plate  at  the  orientation  of  0°  at  the  strain  rate  of  1749/s. 

Fig. 18c  Hopkinson bar compression stress-strain data from test 62 obtained on a longitudinal specimen from the
ATK plate at the orientation of 90° at the strain rate of 1803/s.

Fig.  18d  Hopkinson  bar  compression  stress-strain  data  from  test  63  obtained  on  a  longitudinal  specimen  from  the 
ATK  plate  at  the  orientation  of  90°  at  the  strain  rate  of  1763/s. 

Fig. 18e  Hopkinson bar compression stress-strain data from test 60 obtained on a radial specimen from the ATK
plate at the orientation of 0° at the strain rate of 1734/s.

Fig.  18f  Hopkinson  bar  compression  stress-strain  data  from  test  61  obtained  on  a  radial  specimen  from  the  ATK 
plate  at  the  orientation  of  0°  at  the  strain  rate  of  1778/s. 

Fig.  18g  Hopkinson  bar  compression  stress-strain  data  from  test  56  obtained  on  a  radial  specimen  from  the  ATK 
plate  at  the  orientation  of  90°  at  the  strain  rate  of  1771/s. 

Fig.  18h  Hopkinson  bar  compression  stress-strain  data  from  test  57  obtained  on  a  radial  specimen  from  the  ATK 
plate  at  the  orientation  of  90°  at  the  strain  rate  of  1715/s. 

Fig.  19  The  geometry  of  the  copper  projectile  after  impact. 

Fig.  20  The  geometry  of  the  copper  projectile  after  impact. 

Fig.  21  The  radial  profiles  along  the  length  of  the  projectile  after  impact  for  two  experimental  tests  and  for  the 
computer  simulation. 

Fig.  22  The  Von-Mises  stress  contour  plot  through  the  middle  of  the  projectile  after  impact. 

Fig.  23  The  effective  plastic  strain  contour  plot  through  the  middle  of  the  projectile  after  impact. 

Fig.  24  The  Von-Mises  stress  contour  plot  at  the  impact  end  of  the  cylinder. 

Fig. 25  The effective plastic strain plot at the impact end of the cylinder.

Fig. 1  The one-half section of copper plate which is referred to as the Nebraska plate.

Fig. 3  The specimen layout in the rim of the annealed Allianttech Systems
domed C10100 copper plate from Lot A2365. The rim material
provided specimens for metallography, x-ray diffraction, quasi-static and
high strain rate mechanical testing, and Taylor impact tests from the
positions indicated.

Fig. 5  The initial geometry of the finite element simulation. The solid finite
element model of the 30 caliber copper projectile is 0.9 inches long with
a diameter of 0.289 inches. The average element size has a length of
0.02 inches.


Fig. 6 The set-up of the finite element simulation. The 30 caliber copper projectile is located
0.01 inches from a rigid wall that is fixed in space. The friction between the wall and the projectile
was assumed to be zero. The full model consisted of 6360 elements and 8150 nodes.

Fig. 9 SEM micrographs of copper in the as-rolled condition and after vacuum annealing for one hour
at 300°C. Specimens were etched in concentrated nitric acid.


[Fig. 11 plot: vertical axis, Intensity (Annealed - Worked); legend, (111), (200), (220).]

Fig. 11 The differences between the integrated intensity of the (111), (200), and (220) pole
densities of the annealed condition as a function of chi angle minus the integrated
intensity of the cold worked condition at the same chi angle. These differences in pole
figures are shown for the specimen taken from the center of the Nebraska plate.


[Fig. 12 plot legend: (111), (200), (220).]

Fig. 12 The average changes in the differences between the integrated intensity of the (111),
(200), and (220) pole densities of the annealed condition as a function of chi angle
minus the integrated intensity of the cold worked condition at the same chi angle.
These differences in pole figures were averaged over the three specimens taken from
the one quarter radial position of the Nebraska plate.

[Fig. 13 plot legend: (111), (200), (220).]

Fig. 13 The average changes in the differences between the integrated intensity of the
(111), (200), and (220) pole densities of the annealed condition as a function of chi
angle minus the integrated intensity of the cold worked condition at the same chi angle.
These differences in pole figures were averaged over the three specimens taken from
the three-quarter radial position of the Nebraska plate.

Fig. 14 The average changes in the differences between the integrated intensity of the
(111), (200), and (220) pole densities of the annealed condition as a function of chi
angle minus the integrated intensity of the cold worked condition at the same chi angle.
These differences in pole figures were averaged over the three specimens taken from
the edge position of the Nebraska plate.

[Fig. 16 diagram label: Copper, Lot A2365.]

Fig. 16 Rockwell F hardness data were taken on the "as received" ATK
plate at the positions indicated.
[Figs. 17a-17f plots: vertical axis, Stress (MPa); legend, engineering stress-strain and true stress-strain.]

Fig. 17a Instron compression stress-strain data taken at a strain rate of 0.022 per second on
a longitudinal specimen from the ATK plate at the orientation of 0°.

Fig. 17b Instron compression stress-strain data taken at a strain rate of 0.022 per second on
a longitudinal specimen from the ATK plate at the orientation of 0°.

Fig. 17c Instron compression stress-strain data taken at a strain rate of 0.022 per second on
a longitudinal specimen from the ATK plate at the orientation of 90°.

Fig. 17d Instron compression stress-strain data taken at a strain rate of 0.022 per second on
a longitudinal specimen from the ATK plate at the orientation of 90°.

Fig. 17e Instron compression stress-strain data taken at a strain rate of 0.022 per second on
a radial specimen from the ATK plate at the orientation of 0°.

Fig. 17f Instron compression stress-strain data taken at a strain rate of 0.022 per second on
a radial specimen from the ATK plate at the orientation of 90°.

Fig. 18a Hopkinson bar compression stress-strain data from test 58 obtained on a
longitudinal specimen from the ATK plate at the orientation of 0° at the strain rate of
1748/s.

Fig. 18b Hopkinson bar compression stress-strain data from test 59 obtained on a
longitudinal specimen from the ATK plate at the orientation of 0° at the strain rate of
1749/s.

Fig. 18c Hopkinson bar compression stress-strain data from test 62 obtained on a
longitudinal specimen from the ATK plate at the orientation of 90° at the strain rate of
1803/s.


DEVELOPMENT OF PERTURBED PHOTOREFLECTANCE,
IMPLEMENTATION OF NONLINEAR OPTICAL PARAMETRIC DEVICES,
AND CHARACTERIZATION OF MICRO-CAVITY LASERS
BASED ON SEMICONDUCTOR STRUCTURES

Yujie  J.  Ding 
Assistant  Professor 

Department  of  Physics  and  Astronomy 
Bowling  Green  State  University 

Abstract 

We have observed saturation of the photoluminescence peak at low pump intensities in
growth-interrupted and undoped asymmetric-coupled quantum-well structures. We believe the
saturation is due to filling of the exciton states localized at the interface islands. We have
observed an increase of the photoluminescence decay time as the pump intensity increases in the
same structures. We have also observed the relative saturation of the donor-to-acceptor transition
in a growth-interrupted and compensate-doped asymmetric-coupled quantum-well structure.
The saturation is accompanied by an anomalously large blue shift with a magnitude as large as
11 meV. We believe this shift is due to the change of the Coulomb interaction energy
between ionized donors and acceptors as the laser intensity increases.

Based on our design, a new multilayer structure was grown for demonstrating
transversely-pumped counter-propagating optical parametric oscillation and amplification, and
achieving surface-emitting sum-frequency generation in a vertical cavity.

We have attempted to mode-lock a Ti:Sapphire laser pumped by an Argon laser. We
conclude that the stability of the Argon laser is crucial for achieving stable mode-locking.

Introduction 

Recently, it has been shown that by interrupting sample growth at every interface, one
can obtain multiple photoluminescence (PL) peaks with separate emission energies that
correspond to the excitonic emissions at interface islands of different sizes [1-4]. Because of
the formation of these interface islands, the well widths at these islands generally differ by
one monolayer with respect to the designed width in high quality samples [5]. However, the
area ratios among all these islands of different well widths are random and cannot be
controlled in the growth process. (Without the growth interruption, the recombination of the carriers
in the wells with different widths results in inhomogeneous broadening of the PL
spectrum.) At low temperatures, all the carriers generated by the pump eventually relax
down to the lowest energy levels and are localized in the islands, resulting in very large carrier
densities. If the total area of the islands is small, it would be possible to completely fill the
exciton states in these islands at relatively low intensities, which may manifest as the saturation of
the PL peaks. It is worth noting that in growth-interrupted samples, band-filling effects are
spatially localized effects, due to additional confinement along the interface, similar to the
situation in quantum dots (i.e., all the islands are spatially isolated).

Previously, we observed [6] saturation of photoluminescence peaks. We believe that it
is due to band-filling effects at the interface islands as a result of the growth interruption.
The intensities required to observe the saturation reflect the total area of the interface islands,
and thus the interface roughness.


To determine the characteristic carrier density for completely filling the interface islands,
one needs to measure the carrier recombination times. Furthermore, the dependence of the
recombination time on the excitation intensity may provide information about the nature of the
recombination processes.

Each sample for studying saturation of the photoluminescence peak, photoluminescence
decay, and shift of the transition energy was grown by MBE on a semi-insulating GaAs
substrate at a temperature of 600 °C in collaboration with Naval Research Labs. The epitaxial
layers consist of 20 periods, each of which is composed of two narrow asymmetric coupled
GaAs quantum wells with designed thicknesses of 50 Å and 65 Å, coupled by a 35-Å
Al0.35Ga0.65As barrier (see Fig. 1). The thickness of the barriers between adjacent
periods is 150 Å. During the sample growth there is an interruption for 60 seconds at every
interface. Because of this growth interruption, interface islands with sizes larger than the
average exciton radius are formed, allowing excitons to be spatially localized within these
islands with optical transition energies separate from that of free excitons [1]. As a result, in
each designed well the absorption and/or emission peaks are separated from each other,
corresponding to a thickness difference of one or a few monolayers. We have grown three
samples: two of them (samples #1 and #2) are undoped, and in the third (sample #3) the 65-Å
well in each unit is compensate-doped with Be and Si at densities of 3 × 10^[exponent illegible] cm^-3.

Recently, surface-emitting green light was obtained [7] by frequency-doubling an infrared
laser beam (1.06 µm) in a waveguide based on periodically modulated second-order
susceptibility in alternating AlxGa1-xAs and AlyGa1-yAs (x ≠ y) layers. When the multilayers are
sandwiched between two quarter-wave stacks, a large increase in the conversion efficiency was
observed [8], though quasi phase-matching was not established. Following Ref. [9], the
second-order susceptibility of asymmetric-coupled quantum-well (QW) domain structures was
measured in the surface-emitting geometry [10]. The maximum conversion efficiency so far is
still less than 1%/W. Recently, we proposed a novel practical scheme for implementation of
the cascaded nonlinearity using surface-emitting second-harmonic generation (SHG) in a
Fabry-Perot cavity. We have shown that such a scheme can be efficiently used for optical
power limiting and optical phase conjugation at low input power [11]. Most recently [12],
we proposed to achieve nearly 100% conversion efficiency of SHG at low input power
by combining quasi phase-matching and cavity enhancement in semiconductor multilayers or
asymmetric QW domain structures. Thus, our investigation leads to the implementation of
practical frequency doublers which can cover the range from blue to infrared. More
importantly, we proposed to implement tunable optical parametric oscillators (OPOs) and amplifiers
[13] based on a novel configuration. Frequency doublers, optical parametric oscillators and
amplifiers, and nonlinear optical devices based on cascaded second-order
nonlinearities have potential applications in generation of blue light, generation and amplification of
tunable mid-IR light, optical communication, ultrafast detection, sensor protection, real-time
holography, and optical lithography.

Previously, we proposed to use semiconductor multilayers to generate surface-emitting
second-harmonic radiation [12] and to implement transversely-pumped counter-propagating OPOs
[13]. Recently, we designed the first structure. The epitaxial layers consist of two Bragg
reflectors and alternating layers for achieving quasi-phase matching.

To characterize semiconductor lasers in the time-resolved domain, an ultrafast laser source
(i.e., a mode-locked Ti:Sapphire laser) is required to excite the carriers to the high-energy
subbands. The relaxation processes can then be probed via different techniques.

Methodology 

Our asymmetric-coupled quantum-well structure is pumped by a CW Argon laser at the
wavelength of 5145 Å. The photoluminescence was collected by a monochromator via a
lens.

For the measurement of the PL decay, we used a mode-locked Ar+ laser to provide our
excitation pulses, with a pulse duration of 150 ps and an output wavelength of 5145 Å. The temporal
traces of the PL signal were taken via a streak camera with a time resolution of 20 ps. Fig.
2 shows our schematic set-up.

Our optimized design of the multilayer structure is based on our rigorous consideration
of surface-emitting frequency doublers and transversely-pumped counter-propagating OPOs
and OPAs [Fig. 3(a)]; see Refs. [12,13].

The schematic set-up for the Argon-laser-pumped mode-locked Ti:Sapphire laser and the
Ti:Sapphire laser cavity are shown in Fig. 4. We have followed the manual for the Ti:Sapphire
laser provided by Clark-MXR Inc. The mode structure and stability were determined by eye after
expanding the laser beam. The output laser pulse from the mode-locked Ti:Sapphire laser can be
sent to Coumarin 460 for generating two-photon fluorescence.

Results 

For sample #1, the PL spectra for several pump intensities are shown in Fig. 5. At a
laser intensity of 9.7 mW/cm² there are two emission peaks: the one on the long wavelength
side (~7780 Å) corresponds to the emission of excitons at the interface islands, while the
other one (~7773 Å) corresponds to the free excitons. When we change the laser intensity
from 9.7 mW/cm² to 1.4 W/cm² at 4 K, we can see that the emission peak for the localized
excitons loses its relative strength.

Due to growth interruption, a single PL peak breaks into two because of the formation
of interface islands with a size larger than the exciton radius. At low temperatures, all the
carriers generated by the pump laser eventually relax down to the lowest energy
levels and are localized in the islands, resulting in large carrier densities. If the total area of the
islands is small, it would be possible to completely fill the exciton states in these islands at
relatively low intensities, which manifests as the saturation of the PL peaks. This type of
band-filling effect only occurs at the spatially localized islands. The laser intensity required
to almost completely fill the localized exciton states is more than two orders of magnitude
lower than that obtained before [6].

We have made time-resolved PL measurements in our sample. Fig. 6 shows the
typical temporal PL traces detected at the center wavelength of the e1hh1 emission peak (the
excitonic emission peak) as a result of the carrier recombination at the interface islands. At
the low excitation intensity (207 W/cm²), the PL signal at the e1hh1 (II) emission peak has a
decay time of about 269 ps. As the intensity increases, the decay time increases. As shown
in Fig. 6, when the laser intensity is 414, 621, and 828 W/cm², the decay time is about 326,
537, and 666 ps, respectively.
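
The trend in the reported decay times can be summarized with a straight-line fit. The sketch below is illustrative only: it uses the four intensity and decay-time pairs quoted above, and the choice of a linear model is an assumption made here, not a result claimed in the report.

```python
# Decay times quoted in the text for the e1hh1 emission peak.
intensity_w_cm2 = [207.0, 414.0, 621.0, 828.0]
decay_ps = [269.0, 326.0, 537.0, 666.0]

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = least_squares_line(intensity_w_cm2, decay_ps)
print(f"slope = {slope:.3f} ps per W/cm^2, intercept = {intercept:.1f} ps")
```

With only four points the fit is a rough descriptive summary; it does not establish that the dependence is linear.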

In the quasi-CW regime, the density of excitons can be determined as

$$N_{ex} = \frac{\alpha I_{laser} \tau}{\hbar\omega_{laser}} \qquad (1)$$

where $I_{laser}$ is the laser intensity, $\alpha$ is the absorption coefficient, $\tau$ is the
recombination time, and $\hbar\omega_{laser}$ is the energy of a
single photon. The intensity required to completely fill the e1hh1 exciton states is about
1.4 W/cm². Assuming α = 10000 cm⁻¹ at the pumping wavelength in our experiments, the
exciton density is then estimated to be 9.75 × 10¹² cm⁻³. The corresponding area density is
1.46 × 10⁷ cm⁻². This carrier density is more than three orders of magnitude lower than that
in Ref. [6].
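
As a check on the estimate for sample #1, the following sketch evaluates Eq. (1) read as N = α I τ / (ħω). Two inputs are assumptions made here rather than statements from the report: the recombination time is taken as the 269 ps decay time measured at low excitation, and the areal density is obtained by multiplying by 150 Å, the combined 50 + 35 + 65 Å thickness of one coupled-well unit. The function names are hypothetical.

```python
H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light in vacuum, m/s

def photon_energy_joule(wavelength_angstrom):
    """Energy of a single photon at the given vacuum wavelength."""
    return H * C / (wavelength_angstrom * 1e-10)

def exciton_density_cm3(intensity_w_cm2, alpha_per_cm, tau_s, wavelength_angstrom):
    """Quasi-CW density N = alpha * I * tau / (photon energy), in cm^-3."""
    return alpha_per_cm * intensity_w_cm2 * tau_s / photon_energy_joule(wavelength_angstrom)

# Sample #1: 1.4 W/cm^2, alpha = 1e4 cm^-1, pump at 5145 Angstrom.
# Assumption: tau = 269 ps (the low-intensity decay time).
n_volume = exciton_density_cm3(1.4, 1.0e4, 269e-12, 5145.0)

# Assumption: areal density over a 150 Angstrom (1.5e-6 cm) coupled-well unit.
n_area = n_volume * 1.5e-6

print(f"{n_volume:.3g} cm^-3, {n_area:.3g} cm^-2")
```

Under these assumptions the result is close to 9.75 × 10¹² cm⁻³ and 1.46 × 10⁷ cm⁻², the two values quoted for sample #1.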

For sample #2, the PL spectra for several pump intensities are plotted in Fig. 7. At
the pump intensity of 25 W/cm², there are three peaks. The peak located at 7777 Å
corresponds to the free excitonic e1hh1 emission. The other two peaks at 7788 Å and 7808
Å correspond to the recombination of the excitons at interface islands. As the intensity
increases, the relative strengths of these two peaks on the long wavelength side decrease. At
the intensity of 153 W/cm², these two peaks are almost completely saturated. As for
sample #1, we measured the photoluminescence decay for the e1hh1 transition. The
recombination time is determined to be 278 ps. Using Eq. (1), for the intensity of 153 W/cm² we
estimated the exciton density to be 1.7 × 10^[illegible] cm^[illegible], which is one order of magnitude lower
than that in Ref. [6].

For sample #3, we measured PL spectra at different pump intensities (see Fig. 8). At
the intensity of 0.5 W/cm², there is a broad peak located around 7950 Å. This peak is the
result of the donor-to-acceptor transition. There is also a peak located on the short wavelength side
(~7800 Å). As the intensity increases, the relative strength of this peak increases. At the
intensity of 423 W/cm², this peak dominates the PL spectrum. We believe this peak
corresponds to the e1hh1 excitonic transition. Meanwhile, there is a blue shift of the broad
peak (see Fig. 9). The maximum amount of the shift is about 11 meV as the laser intensity
increases from 0.5 W/cm² to 423 W/cm². The amount of the shift is several times larger
than that in a bulk sample. Thus, we use the word "anomalously" to emphasize the magnitude
of the shift. As the laser intensity increases, the average distance between the donors and
acceptors participating in the recombination decreases. As a result, the Coulomb interaction
energy between the ionized donors and acceptors increases, and the peak energy of the
donor-to-acceptor transition also increases. We believe this is the origin of the blue shift.
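
The mechanism described above corresponds to the standard textbook expression for donor-acceptor pair recombination, sketched below. The symbols are introduced here for illustration and do not appear in the report: $E_g$ is the relevant gap (in a quantum well, the confined transition energy), $E_D$ and $E_A$ are the donor and acceptor binding energies, $\varepsilon_r$ is the relative permittivity, and $r$ is the donor-acceptor separation.

```latex
\hbar\omega(r) \approx E_g - (E_D + E_A) + \frac{e^2}{4\pi\varepsilon_0\varepsilon_r r},
\qquad
\Delta(\hbar\omega) = \frac{e^2}{4\pi\varepsilon_0\varepsilon_r}
\left(\frac{1}{r_2} - \frac{1}{r_1}\right), \quad r_2 < r_1 .
```

In this picture a smaller average separation $r_2$ at high intensity raises the Coulomb term and therefore the emission energy, which is the direction of the observed shift. The expression is a sketch and is not used here to reproduce the 11 meV value.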

Based on Refs. [12,13], we have designed an optimized structure [see Fig. 3(b)].
We have grown this structure in collaboration with Drs. J. L. Loehr and J. Ehret at Wright
Labs. We will test the performance of this structure as an efficient frequency doubler and
optical parametric oscillator and amplifier at Bowling Green State University. The pump,
input, and output wavelengths are designed to be 1.06 µm, 1.58 µm, and 3.23 µm. By
changing the incident angle, one can tune the output wavelengths [13].
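
The three design wavelengths can be checked against photon-energy conservation in a parametric process, 1/λ(pump) = 1/λ(signal) + 1/λ(idler). The sketch below assumes that the "input" wavelength plays the role of the signal and the "output" wavelength that of the idler; this mapping and the function name are assumptions made for illustration.

```python
def idler_wavelength_um(pump_um, signal_um):
    """Idler wavelength from energy conservation: 1/pump = 1/signal + 1/idler."""
    inv_idler = 1.0 / pump_um - 1.0 / signal_um
    if inv_idler <= 0.0:
        raise ValueError("signal wavelength must be longer than the pump wavelength")
    return 1.0 / inv_idler

idler = idler_wavelength_um(1.06, 1.58)
print(f"idler wavelength: {idler:.2f} um")
```

With the rounded inputs 1.06 µm and 1.58 µm this gives about 3.22 µm, which agrees with the quoted 3.23 µm to within the rounding of the inputs.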

The Ti:Sapphire laser is pumped by a refurbished Argon laser (Coherent Innova 90);
see Fig. 4. For 4-watt pump power from the multi-line Argon laser, a conversion efficiency as high
as 20% was achieved in the CW regime. We tried to mode-lock the Ti:Sapphire laser.
We observed two-photon fluorescence in Coumarin 460, but were not able to stabilize
the mode-locked laser output. We believe that mode structure and pointing stability could be
the problems for mode-locking the Ti:Sapphire laser. When there is no aperture for the
Argon laser, high-order modes other than TEM00 exist in the Argon output beam profile. In
addition, we have crudely estimated the pointing stability as ~100 µrad, which is an order of
magnitude larger than that required for stable mode-locking.

Conclusion 

We have observed saturation of the photoluminescence peak at low pump intensities. We
have measured time-resolved photoluminescence decay in growth-interrupted and undoped
asymmetric-coupled quantum wells. We have subsequently determined decay times and
characteristic carrier densities for the observed photoluminescence intensity saturation. We
have observed an anomalously large blue shift of the donor-to-acceptor transition peak in a
compensate-doped asymmetric-coupled quantum-well structure. We have proposed a
mechanism for such a large shift. We have grown a multilayer structure that can be used to
implement an optical parametric oscillator and amplifier and a frequency doubler in a novel
configuration. Finally, we have tried to mode-lock a Ti:Sapphire laser and summarized
potential problems causing unstable mode-locked output.

As a result of our research supported by this grant, we have given four conference
presentations [a,b,c,d] and submitted two journal papers for publication [e,f]. Three more journal
papers based on our results will be submitted for publication [g,h,i].
COMPUTATIONS OF DRAG REDUCTION AND BOUNDARY LAYER STRUCTURE ON A
TURBINE BLADE WITH AN OSCILLATING BLEED FLOW


Elizabeth  A.  Ervin 

Abstract 

Current  studies  suggest  that  an  oscillating  bleed  flow  passed  through  a  turbine  rotor  blade  can 
reduce  the  friction  drag  on  the  blade.  Furthermore,  the  resulting  boundary  layer  structure  and  possible 
separation  from  the  blade  will  decrease  the  effective  available  area  for  the  high  speed  flow  between 
adjacent blades, improving off-design performance. A fundamental study of the flow dynamics around a
blade with oscillating cooling flow is being conducted. A transient solution of the Reynolds-averaged
Navier-Stokes,  continuity,  and  energy  equations  is  being  developed  to  analyze  the  effects  of  a  pulsing  jet  on 
vortex  development  and  interaction  with  the  blade  surface.  The  computational  investigation  is  the  first  to 
study  the  effects  of  an  oscillating  bleed  flow  on  drag  reduction  and  boundary  layer  structure  about  a  turbine 
blade.  This  is  a  potential  application  for  MEMS  to  control  turbine  off-design  performance. 

Existing  software  will  be  verified  and  revised  as  part  of  the  proposed  effort  for  the  addition  of 
cooling.  Furthermore,  a  more  sophisticated  turbulence  model  will  be  implemented  to  describe  the 
convection  and  diffusion  of  turbulent  kinetic  energy  that  would  be  expected  with  the  ejection  of 
cooling. 

INTRODUCTION 

A  variable  area  turbine  typically  consists  of  a  standard  rotor  and  a  variable  area  turbine  nozzle, 
which  uses  pivoting  vanes  or  moveable  sidewalls.  Potential  operating  and  performance  benefits  of  a 
variable  area  turbine  include  high  engine  efficiency,  modulation  of  work  split  between  the  high  and  low 
pressure  turbines,  flow  capacity  control,  and  very  low  blade  vibration,  when  operated  at  optimum 
rotational  speed  and  pressure  ratio  over  a  range  of  engine  power.  A  large  effort  to  develop  gas  turbine 
engines  with  a  variable  area  turbine  took  place  in  the  late  1970’s  and  early  1980’s  (Chappie,  et  al.,  1980). 
However,  the  cost  of  implementing  pivoting  vanes  or  moveable  sidewalls  in  high  temperature  environments, 
typical  of  turbines,  was  considered  prohibitive. 

Recent  developments  suggest  that  an  oscillating  cooling  flow  through  a  turbine  rotor  blade  may 
reduce  the  friction  drag  on  the  blade.  The  resulting  boundary  layer  structure  and  possible  separation  from 
the  blade  would  decrease  the  effective  available  area  for  the  high  speed  flow  between  adjacent  blades.  This 
would  improve  off-design  performance,  similar  to  that  of  a  variable  area  turbine.  Oscillating  blade  flow  can 
be  produced  by  acoustic  perturbation  and  could  make  strategic  use  of  emerging  micro-electro-mechanical 
systems.  Allen  and  Glezer  (1995),  for  example,  are  using  a  micro-electro-mechanical  jet  actuator  to 
produce  an  oscillating  jet  to  control  a  primary  jet. 

The  oscillating  flow  would  be  used  to  control  the  vortex  development  on  the  blade  surface.  Vortex 
development  and  interaction  with  a  surface  are  complex  processes,  and  measurements  in  operating  engines 
are  difficult  and  expensive.  As  a  consequence,  a  computational  model  of  a  blade  with  oscillating  cooling 
flow is proposed. A transient solution of the Reynolds-averaged Navier-Stokes, continuity, and energy
equations will permit analysis of the effects of the oscillating jet on vortex development and interaction with
the blade surface. The results of this study will be used to develop an experiment to be conducted at
Wright Laboratory.

BACKGROUND 

A  wall  jet  consists  of  an  outer  shear  layer  and  an  inner  layer  that  behaves  like  a  viscous  boundary 
layer.  Cohen,  et  al.  (1992)  calculated  two  unstable  modes:  an  inviscid  mode  that  depicts  the  large-scale 
disturbances  in  the  free  shear  layer,  and  a  viscous  mode  that  concerns  the  small-scale  disturbances  near  the 
wall.  They  showed  that  the  relative  importance  of  each  mode  can  be  controlled  by  small  amounts  of 
blowing  and  suction. 

Studies of an oscillating plane wall jet in an external flow have shown 10 to 40 percent reductions
in shear drag, with minimal effect on maximum velocity decay and jet spreading rate (Fasel, et al., 1995).
Detailed particle image velocimetry (PIV) measurements of an acoustically perturbed laminar plane wall jet
showed that the perturbation enhances growth of a vortex in the outer shear layer. This, in turn, interacts
with the inner layer, resulting in a counter-rotating vortex pair (Shih and Gogineni, 1995). The vortex pair
remains attached to the wall under the influence of the downstream vortex pair until it is further diffused
downstream. When the vortex pair is dislodged from the surface, jet spreading and transition to turbulence
follow. The forcing frequency determines the distance between adjacent vortex pairs, which in turn
controls the flow field.
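
One simple way to read the last statement is kinematic: if one vortex pair forms per forcing period and the pairs convect downstream at a roughly constant speed, their spacing is the convection speed divided by the forcing frequency. The sketch below illustrates that scaling only. The constant-speed assumption, the function name, and the numerical values are hypothetical and are not taken from the cited studies.

```python
def vortex_pair_spacing(convection_speed_m_s, forcing_frequency_hz):
    """Streamwise spacing (m) of vortex pairs formed once per forcing period."""
    if forcing_frequency_hz <= 0.0:
        raise ValueError("forcing frequency must be positive")
    return convection_speed_m_s / forcing_frequency_hz

# Hypothetical values chosen only to show the inverse scaling with frequency.
for f_hz in (10.0, 20.0, 40.0):
    print(f_hz, vortex_pair_spacing(0.5, f_hz))
```

Doubling the forcing frequency halves the spacing under this assumption, which is one route by which the forcing frequency could shape the downstream flow field.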

Little  has  been  done  to  numerically  simulate  the  interaction  of  turbine  blade  bleed  jets  with  the 
primary  flow.  Vogel  (1994)  developed  a  steady  solution  of  the  Reynolds-averaged  Navier-Stokes, 
continuity,  and  energy  equations  to  model  flow  over  a  turbine  blade  section  with  film  cooling.  Internal 
coolant  geometry  was  modeled  as  well  as  the  outer  flow  region.  The  model  demonstrated  vortex 
development,  typical  of  jets  in  crossflow,  and  compared  favorably  with  flow  visualization  experiments  that 
were  conducted.  The  effect  of  the  steady-state  coolant  jets  on  the  shear  drag,  if  any,  was  not  reported. 

Clearly,  more  study  is  needed  to  see  if  a  periodic  jet  can  reduce  the  shear  drag  on  a  turbine  blade 
and  simultaneously  control  the  flow  dynamics.  No  data  concerning  vortex  development  and  control  with 
oscillating cooling flow on an airfoil appears to have been published yet. Hence, it is believed that the
current study is the first examination of drag reduction and boundary layer structure on a turbine blade
with an oscillating coolant flow.
The original two-dimensional model uses the Baldwin and Lomax (1978) model to account for
turbulence. Algebraic turbulence models such as this cannot model the convection and diffusion
of turbulent kinetic energy that are expected with the ejection of coolant. The new two-equation algorithm
is based on the recently developed k-ℛ model (Goldberg, 1994). This model is similar to the well-known k-ε
model of Launder and Spalding (1974), where k is the turbulent velocity fluctuation kinetic energy and ε is
the viscous dissipation. ℛ is the undamped eddy viscosity and is equal to k²/ε. It offers two advantages over
previous models:

1. ℛ is zero at the wall, unlike ε.

2.  The  model  is  not  based  on  wall  functions,  which  makes  it  adaptable  for  external  flows. 

The  addition  of  a  two-equation  turbulence  model  required  two  new  transport  equations,  which  has  resulted 
in  a  massive  software  revision,  impacting  dozens  of  subroutines. 

COMPUTATIONAL  METHOD 

The  software  used  for  the  simulation  of  the  governing  equations  was  developed  by  Allison  Engine 
Company,  under  U.  S.  Air  Force  Contract  F33615-90-C-2028  for  Wright  Laboratory,  to  study  vane-blade 
interaction,  as  described  by  Rao,  et  al.  (1994a,  1994b).  This  software  was  verified  and  revised  as  part  of 
the  proposed  effort  for  the  addition  of  cooling  as  described  above. 

CODE  DESCRIPTION 

The  conservative  forms  of  the  transient  Reynolds-averaged  Navier-Stokes,  continuity,  and  energy 
equations  are  solved  on  a  blade-to-blade  stream  surface  of  revolution.  A  numerical  finite  difference 
technique is used, with central differencing for second-order accuracy in space, and a five-stage Runge-Kutta
algorithm for second-order accurate integration in time. An artificial dissipation model that blends
second  and  fourth  order  differences  is  added  to  damp  out  non-physical  oscillations  produced  by  central 
differencing.  It  utilizes  pressure  as  a  sensor  to  capture  physical  discontinuities  such  as  shock  waves  and 
stagnation  points. 
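
The blending of second and fourth differences with a pressure sensor can be illustrated with a minimal one-dimensional sketch. It is written in the style of the widely used Jameson-Schmidt-Turkel dissipation model, which is an assumption here because the report does not name the scheme. The function name, the coefficients `k2` and `k4`, and the omission of any spectral-radius scaling are illustrative choices and are not taken from the Allison code.

```python
def blended_dissipation(u, p, k2=0.5, k4=1.0 / 32.0):
    """Artificial dissipation at each node of a 1-D grid (illustrative only).

    u  : values of one conserved variable at the grid nodes
    p  : pressures at the same nodes, used only as a sensor (must be > 0)
    k2 : weight of the second-difference term (assumed value)
    k4 : weight of the fourth-difference term (assumed value)
    """
    n = len(u)

    # Pressure sensor: normalised second difference of pressure.
    # It is near zero in smooth flow and large across a discontinuity.
    nu = [0.0] * n
    for i in range(1, n - 1):
        nu[i] = abs(p[i + 1] - 2.0 * p[i] + p[i - 1]) / (
            p[i + 1] + 2.0 * p[i] + p[i - 1]
        )

    def face_flux(i):
        """Dissipative flux at the face between nodes i and i + 1."""
        eps2 = k2 * max(nu[i], nu[i + 1])
        # The fourth-difference term is switched off where eps2 is large.
        eps4 = max(0.0, k4 - eps2)
        first_diff = u[i + 1] - u[i]
        third_diff = u[i + 2] - 3.0 * u[i + 1] + 3.0 * u[i] - u[i - 1]
        return eps2 * first_diff - eps4 * third_diff

    # Difference of face fluxes gives the term added to the residual.
    d = [0.0] * n
    for i in range(2, n - 2):
        d[i] = face_flux(i) - face_flux(i - 1)
    return d
```

For a sawtooth field with uniform pressure the sensor vanishes, so only the fourth-difference term acts: it raises the low points and lowers the high points, which is the damping of odd-even oscillations that central differencing needs.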

The code uses a body-fitted hyperbolic O-grid embedded in a rectangular H-grid as shown in
Figure 1. The outer grid resolves the free stream flow and the O-grid is used in the boundary layer region,
with fine grid resolution near the surface.
Non-reflective inflow and outflow boundary conditions are calculated based on the methodology
developed  by  Cline  (1977).  No-slip  conditions  are  used  on  the  airfoil  surface(s)  and  periodic  boundary 
conditions  are  used  in  the  polar  direction.  The  interface  between  the  stator  exit  and  the  rotor  inlet  can  be 
modeled  with  overlapping  H-grids  and  a  time-space  phase-lag  procedure,  originally  developed  by  Erdos 
(1977). 


In the numerical simulation, body-fitted curvilinear coordinates are utilized and the flow is mapped
to a uniformly spaced rectangular coordinate region with the Jacobian matrix of the transformation. Also,
the variables are non-dimensionalized. The second viscosity coefficient, λ, is set equal to -2/3 μ, where μ
is the dynamic viscosity, and the Prandtl number is constant.

TURBULENCE  MODEL 

The prior-existing two-dimensional code uses the Baldwin and Lomax (1978) model to account for
turbulence. Algebraic turbulence models such as this cannot model the convection and diffusion
of turbulent kinetic energy that are expected with the ejection of coolant. The new two-equation algorithm
is based on the recently developed k-ℛ model (Goldberg, 1994). This model is similar to the well-known k-ε
model of Launder and Spalding (1974), where k is the turbulent velocity fluctuation kinetic energy and ε is
the viscous dissipation. ℛ is the undamped eddy viscosity and is equal to k²/ε. It offers two advantages over
previous models:

1. ℛ is zero at the wall, unlike ε.

2.  The  model  is  not  based  on  wall  functions,  which  makes  it  adaptable  for  the  H-grid,  which  has 
no  wall. 

The transport equations for the turbulent kinetic energy, k, and the undamped eddy viscosity, ℛ,
are:

$$\frac{\partial Q}{\partial t} + \frac{\partial F_j}{\partial x_j} = S \qquad (1)$$


where: 
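$$Q = \begin{bmatrix} \rho k \\ \rho \mathcal{R} \end{bmatrix}, \qquad
F_j = \begin{bmatrix}
\rho u_j k - \left(\mu + \dfrac{\mu_t}{\sigma_k}\right)\dfrac{\partial k}{\partial x_j} \\[2ex]
\rho u_j \mathcal{R} - \left(\mu + \dfrac{\mu_t}{\sigma_\varepsilon}\right)\dfrac{\partial \mathcal{R}}{\partial x_j}
\end{bmatrix}$$

$$S = \begin{bmatrix}
P - \rho\,\dfrac{k^2}{\mathcal{R}} \\[2ex]
-\dfrac{2}{k}\left(\mu + \dfrac{\mu_t}{\sigma_\varepsilon}\right)\left(\nabla k \cdot \nabla \mathcal{R}\right)
+ \left(2 - C_{\varepsilon 1}\right)\dfrac{\mathcal{R}}{k}\,P
- \left(2 - C_{\varepsilon 2}\right)\rho k
\end{bmatrix} \qquad (2)$$
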


Equation 2 uses the conservative form of the transport equations for k and ℛ as described by Goldberg
(1994). The source term, P, is defined as the production of turbulent energy by the work of the main flow
against the Reynolds stresses:


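$$P = -\rho\,\overline{u_i' u_j'}\,\frac{\partial u_i}{\partial x_j}
= \mu_t\left[\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}
- \frac{2}{3}\frac{\partial u_k}{\partial x_k}\,\delta_{ij}\right]\frac{\partial u_i}{\partial x_j} \qquad (3)$$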


The latter form of P is based on the Boussinesq approximation (1877) and is the form used by Launder and
Spalding (1974). The u_i terms represent the mean velocity components and u_i' is the fluctuating velocity
component in the i-direction. The eddy viscosity, μ_t, is defined as:

$$\mu_t = C_\mu f_\mu\,\rho\,\mathcal{R} \qquad (4)$$

The two-equation k-ℛ model affects the transport equations for momentum and energy, and these
effects are summarized here because the literature is incomplete on this point. The eddy viscosity, μ_t, is added
to the dynamic viscosity, μ, including in the expression for the second viscosity coefficient. The μ/Pr terms are
replaced with μ/Pr + μ_t/Pr_t, as in the Baldwin and Lomax (1978) model. Pr_t is typically set to 0.9 for air
flows and is the value used in the prior-existing software.
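
The substitutions described above can be sketched in a few lines. This is a minimal illustration, not the report's code: the function names are hypothetical, the eddy viscosity is written in the standard two-equation form C_μ f_μ ρ k²/ε, and the laminar Prandtl number of 0.72 is an assumed value for air. Only the turbulent Prandtl number of 0.9 and C_μ = 0.09 come from the text.

```python
C_MU = 0.09  # standard coefficient quoted in the report


def eddy_viscosity(rho, k, eps, f_mu=1.0):
    """Eddy viscosity mu_t = C_mu * f_mu * rho * k**2 / eps (assumed standard form)."""
    if eps <= 0.0:
        return 0.0  # no dissipation scale available, so no turbulent contribution
    return C_MU * f_mu * rho * k * k / eps


def effective_coefficients(mu, mu_t, pr=0.72, pr_t=0.9):
    """Return the effective viscosity and the effective mu/Pr term.

    mu   : laminar (dynamic) viscosity
    mu_t : eddy viscosity
    pr   : laminar Prandtl number (assumed value for air)
    pr_t : turbulent Prandtl number (0.9, as stated in the report)
    """
    mu_eff = mu + mu_t                    # used in the momentum equations
    mu_over_pr_eff = mu / pr + mu_t / pr_t  # used in the energy equation
    return mu_eff, mu_over_pr_eff
```

With `f_mu = 1.0` the eddy viscosity reduces to the undamped value, so a near-wall damping function can be introduced later without changing the calling code.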

The constants σ_k, σ_ε, C_ε1, C_ε2, and C_μ are chosen to be 1.0, 1.3, 1.44, 1.92 and 0.09, respectively,
and are the standard coefficients as recommended by Launder and Spalding (1974) for plane jets, mixing
layers and flows near walls. The function f_μ is unity in the Launder and Spalding model, and was modified
by Launder and Sharma (1974) and later by Lam and Bremhorst (1981) to extend the model to near-wall
regions. Goldberg (1994) used a similar formulation:

(5) 

where R_t is a form of a turbulence Reynolds number:

$$R_t = \frac{k^2}{\nu\varepsilon} = \frac{\mathcal{R}}{\nu} \qquad (6)$$

The constants A_μ and A_ε were chosen to be 2.5 x lO"^ and jlvi (κ = 0.41), respectively, and n = 2, by
prior experimentation (Goldberg, 1994).

The boundary conditions are as follows:

•  Set ℛ = k = 0 at solid walls.

•  Set ℛ = O(10'^) at the freestream and initial conditions.

•  Prescribe the freestream k based on a given turbulence intensity, Tu, using: k

•  Extrapolate k and ℛ from interior points to outflow boundaries.

Using the k-ℛ model in the prior-existing software required the transformation of Equations (1)
and (2) to the 2-D meridional coordinate system used by Rao, et al. (1994a, 1994b). This coordinate
system was developed by Vavra (1974) and is depicted in Figure 2. Details of the development of the full
Navier-Stokes equations in this coordinate system are not published; therefore, this was not a trivial task.
Furthermore, the meridional coordinate system equations were mapped to a body-fitted coordinate system
with the Jacobian matrix of the transformation. The addition of a two-equation turbulence model required
two new transport equations, which resulted in a massive software revision, impacting dozens of
subroutines. The convective terms in the transport equations used first-order upwind differencing (for
stability), while the diffusive and source terms used the standard central-type discretizations.
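
The combination of first-order upwind differencing for convection and central differencing for diffusion can be shown on a one-dimensional scalar transport equation. The sketch below is a simplified stand-in under stated assumptions: explicit time stepping, uniform spacing, and fixed end values. The function and argument names are hypothetical and are not those of the report's software.

```python
def transport_step(phi, u, gamma, source, dx, dt):
    """Advance d(phi)/dt + u d(phi)/dx = gamma d2(phi)/dx2 + source by one step.

    phi    : scalar values at the nodes (e.g. k or the undamped eddy viscosity)
    u      : convecting velocity at the nodes
    gamma  : diffusion coefficient (constant in this sketch)
    source : source term at the nodes
    The two end values are held fixed as simple boundary conditions.
    """
    new = list(phi)
    for i in range(1, len(phi) - 1):
        # First-order upwind: difference toward the side the flow comes from.
        if u[i] >= 0.0:
            conv = u[i] * (phi[i] - phi[i - 1]) / dx
        else:
            conv = u[i] * (phi[i + 1] - phi[i]) / dx
        # Second-order central difference for the diffusive term.
        diff = gamma * (phi[i + 1] - 2.0 * phi[i] + phi[i - 1]) / (dx * dx)
        new[i] = phi[i] + dt * (-conv + diff + source[i])
    return new
```

Convecting a step profile with this scheme smears the front but produces no overshoot or undershoot, which is the stability benefit of upwinding for quantities such as k that must stay non-negative.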
BOUNDARY CONDITIONS


The  boundary  conditions  must  also  be  supplemented  to  allow  for  a  cooling  flow.  The  film  cooling 
will be modeled in a manner similar to that of Vogel (1994). The ejection region will require a separate
H-mesh to represent the flow channel. At the inlet to the flow channel, non-reflecting characteristic variable
boundary conditions will be used. These conditions will assume subsonic normal inflow. The inlet
pressure and velocity will vary periodically to create an oscillating flow at the blade surface. At the coolant
exit, the H-mesh will interface with the outer O-grid. Two layers of dummy cells will be used to specify
boundary conditions and ensure that the block interface nodes behave like interior nodes.

RESULTS  AND  DISCUSSION 

COMPARISON  OF  TURBULENCE  MODELS 

The following plots show results for different turbulence models. The calculations were performed
at an axial chord Reynolds number of 3230 and at a pitch to axial chord ratio, p/cx = 0.944. The low
Reynolds number was selected to compare the separation calculations. The computations were done with
a laminar model, a Baldwin-Lomax turbulence model (Tu = 0%) and, finally, a k-ℛ model with Tu
= 0%.


Figures 3 through 5 are airfoil surface plots using s/s_max as the distance along the airfoil. The
variable, s_max, is defined separately for the pressure (s/s_max < 0) and suction (s/s_max > 0) surfaces so that
s/s_max varies from zero at the leading edge to +/- 1 at the trailing edge.
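
The signed surface coordinate can be computed from a list of surface points as in the sketch below. The point ordering (pressure-side trailing edge, around the leading edge, to the suction-side trailing edge), the leading-edge index argument, and the function name are all assumptions made for illustration.

```python
import math


def signed_arc_length(points, le_index):
    """Return s/s_max at each surface point.

    points   : (x, y) tuples ordered from the pressure-side trailing edge,
               around the leading edge, to the suction-side trailing edge
    le_index : index of the leading-edge point (must not be an end point)
    Values run from -1 (pressure-side trailing edge) through 0 (leading edge)
    to +1 (suction-side trailing edge).
    """
    n = len(points)
    s = [0.0] * n
    # Suction side: accumulate positive arc length away from the leading edge.
    for i in range(le_index + 1, n):
        s[i] = s[i - 1] + math.dist(points[i], points[i - 1])
    # Pressure side: accumulate negative arc length away from the leading edge.
    for i in range(le_index - 1, -1, -1):
        s[i] = s[i + 1] - math.dist(points[i], points[i + 1])
    s_max_suction = s[-1]
    s_max_pressure = -s[0]
    # Each surface is normalised by its own length.
    return [v / s_max_suction if v >= 0.0 else v / s_max_pressure for v in s]
```

Normalising each surface by its own length is what lets both trailing-edge values land exactly on -1 and +1 even though the two surfaces have different arc lengths.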

Figure  3  depicts  the  calculated  pressure  ratio,  p/po,  around  the  airfoil  surface,  where  po  is  the  total 
pressure.  Skin  friction  and  heat  transfer  are  of  primary  interest  and  these  are  described  in  the  following 
two  figures. 

In  Figure  4,  the  skin  friction  coefficient,  Cf,  around  the  airfoil  is  described.  Cf  is  actually  the  shear 
stress  at  the  wall,  calculated  with  dimensionless  velocity,  distance  and  laminar  viscosity.  On  the  pressure 
surface (s/s_max < 0), the negative Cf values correspond to the region between the two stagnation points, both
located on this side of the airfoil. On the suction surface (s/s_max > 0), the region of negative Cf corresponds
to a region of separation. This was confirmed by an examination of velocity profiles in this region (not
shown).


The  heat  transfer  is  characterized  in  Figure  5  using  the  Stanton  number,  St,  defined  here  as: 


$$St = \frac{\left(\;\cdots\;\right)_{wall}}{T_o - T_w}$$


where T is the temperature, n is the direction normal to the surface, μ_l is the laminar viscosity, k is the
thermal conductivity of the gas, T_w is the wall temperature and T_o is the total temperature. The
temperatures are non-dimensionalized with T_o and the other variables are non-dimensionalized as well. The
heat transfer is a maximum at the leading edge stagnation point and just before the trailing edge on the
suction side. The minimum values occur at the trailing edge stagnation point and at the transition to
turbulence on the suction side. A second minimum occurs at the onset of separation. The transition to
turbulence is further described below. In all of these figures, the Baldwin-Lomax (B-L) data are obscured
by the k-ℛ data, showing good agreement between the two models.

The maximum y+ = yu*/ν in the O-grid, where y is the normal distance from the airfoil surface
and ν is the kinematic viscosity, occurred near the leading edge. The friction velocity, u*, is the square
root of the dimensionless shear stress at the wall, Cf, divided by the dimensionless density ratio. Profiles of
u+ versus y+ at the leading edge are shown for the three models in Figure 6. The three models exhibited similar behavior.
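
The wall-unit scaling used for these profiles follows the standard definitions u* = sqrt(τ_w/ρ), u+ = u/u* and y+ = y u*/ν. A minimal sketch is given below; the function name and the sample numbers in the usage note are illustrative assumptions, and the report's own code works with non-dimensional variables.

```python
import math


def wall_units(u, y, tau_w, rho, nu):
    """Convert a velocity sample to wall units.

    u     : velocity parallel to the wall at distance y
    y     : normal distance from the wall
    tau_w : wall shear stress (must be positive)
    rho   : density at the wall
    nu    : kinematic viscosity
    Returns (u_plus, y_plus).
    """
    u_star = math.sqrt(tau_w / rho)  # friction velocity
    return u / u_star, y * u_star / nu
```

For example, `wall_units(10.0, 0.001, 4.0, 1.0, 1e-5)` uses a friction velocity of 2 and gives u+ = 5 at y+ = 200. In a separated region the wall shear stress changes sign, so the scaling would need the magnitude of τ_w there.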

Conclusions 

A comparison of turbulence models was used to examine low Reynolds number flows typical of a
low pressure turbine stage. The calculations confirmed the phenomenon of separation at low Re, low Tu and
p/Cx  =  0.94.  The  turbulence  models  exhibit  similar  behavior,  showing  that  the  new  turbulence  model  is 
working  properly.  This  report  concerns  the  first  phase  of  a  study  of  the  effects  of  the  oscillating  jet  on 
vortex  development  and  interaction  with  the  blade  surface. 
Figure 1. Overlapped O-H grid system.

Figure 2. Quasi-three-dimensional stream surface coordinate system.

Figure 3. Dimensionless Pressure

Figure 4. Skin Friction

Figure 6a. Velocity profile at the leading edge using the laminar model (airfoil surface: u+ vs. y+).

Figure 6b. Velocity profile at the leading edge using the Baldwin-Lomax turbulence model.

Figure 6c. Velocity profile at the leading edge using the k-ℛ turbulence model.
ABSTRACT


The  primary  purpose  of  this  research  project  was  to  purchase  a  two-channel  low  signal  to 
noise  laser  Doppler  velocimeter  (LDV)  signal  processor  so  that  Dr.  Gould  and  North  Carolina 
State  University  (NCSU)  will  have  the  capability  to  make  non-intrusive  velocity  measurements 
in  the  Rolls  Royce/Wright  Laboratory  swirl  combustor  at  NCSU.  The  experimental  activity  in 
test  cell  18  at  the  Aeropropulsion  and  Power  Directorate  at  Wright  Laboratory  has  increased  to 
the  point  where  all  test  stands  are  currently  occupied.  In  fact,  the  Rolls  Royce/Wright  Laboratory 
(WL/POPT)  high  swirl  combustor  is  now  in  storage  due  to  the  lack  of  test  space.  In  light  of  this, 
a collaborative arrangement with Dr. A. S. Nejad and his team of scientists at WL/POPT is
being forged so that the swirl combustor experimental program, initiated at WL/POPT, can
continue at NCSU.

This  equipment,  a  TSI  model  IFA  750  LDV  signal  processor,  is  currently  being  used  with 
an  existing  two  component  LDV  system  at  NCSU  and  has  been  fully  tested.  This  signal 
processor  uses  a  correlation  based  analysis  technique  to  obtain  the  Doppler  frequency  and  thus 
can  be  used  to  make  LDV  measurements  in  flows  where  the  signal  to  noise  ratio  is  low. 
Combusting flows and flows where spatially resolved measurements are needed are examples of
where low signal quality would occur.
LOW SIGNAL TO NOISE SIGNAL PROCESSOR
FOR  LASER  DOPPLER  VELOCIMETRY 


Dr.  Richard  D.  Gould 


1.  INTRODUCTION 

Dr.  Gould  has  developed  a  unique  technical  relationship  with  the  researchers  at  Wright 
Laboratory  in  the  Advanced  Propulsion  and  Power  Directorate  (WL/POPT)  over  the  past  six 
years  through  collaborative  work  supported  by  AFOSR  through  the  Summer  Research  Program 
(SRP),  the  Summer  Research  Extension  Program  (SREP)  and  the  University  Resident  Research 
Program (URRP). He has helped set up four different experiments while at Wright Laboratory and
has also set up a low speed wind tunnel at North Carolina State University (NCSU). These
research projects have resulted in 15 technical publications, nine of which have been co-authored
by Wright Laboratory researchers. The experiments include: (1) a unique experiment where two-point
velocity correlation measurements and single point autocorrelation measurements were made in
the flow behind an axisymmetric sudden expansion, (2) two-component velocity measurements
in the Rolls-Royce/Wright Laboratory high swirl combustor, (3) simultaneous three-component
velocity  measurements  in  the  isothermal  flow  behind  a  bluff  body  flame  holder,  and  (4)  planar 
laser  induced  fluorescence  (PLIF)  measurements  of  acetone  and  OH  concentration  downstream 
of  a  bluff  body  flame  holder  with  normal  fuel  jet  injection. 

The purpose of this research program was to purchase equipment to give North Carolina State
University a long-term capability to conduct experiments of interest to researchers at WL/POPT.
These can include velocity and temperature measurements in
the subsonic Rolls-Royce/Wright Laboratory high swirl combustor. The majority of funds were
used  to  purchase  a  two-channel  correlation  based  LDV  signal  processor  so  that  LDV 
measurements  in  combusting  flows  (where  low  signal  to  noise  signals  are  present)  can  be  made 
at  NCSU.  Substantial  cost  sharing  (37%  of  equipment  cost)  of  this  equipment  was  provided  by 
the  College  of  Engineering  at  NCSU  and  by  the  Applied  Energy  Research  Laboratory  (AERL)  at 
NCSU. All the other equipment necessary for conducting these proposed tests is available at
NCSU and thus the purchase of this signal processor completes the LDV system. The group at
WL/POPT  has  agreed  to  loan  the  subsonic  Rolls-Royce/Wright  Laboratory  high  swirl  combustor 
test  section  to  Dr.  Gould  so  that  the  test  program  initiated  at  Wright  Laboratory  can  be  continued 
at NCSU. The tests will be conducted at NCSU rather than at Wright Laboratory
because all the test facilities in Building 18 of the Aeropropulsion and Power Directorate will
be occupied with other tests for the next two to three years. Additional funds will be sought from
AFOSR or Wright Laboratory to support graduate students at NCSU to run these tests.

2.  EQUIPMENT  DESCRIPTION 

The  purchased  equipment,  a  TSI  Model  IFA  750,  two-channel  laser  Doppler  velocimeter 
(LDV)  correlation  based  signal  processor  (one  master  and  one  slave),  computer  interface,  and 
data  analysis  software,  is  presently  being  used  together  with  the  optical  portion  of  an  existing 
and  fully  operational  three-color,  three-component  LDV  system.  In  an  effort  to  save  the  AFOSR 
and  North  Carolina  State  University  money,  a  sales  department  demonstration  unit  (with  full 
warranty)  was  purchased.  This  decision  allowed  for  the  purchase  of  the  IFA  750  series  rather 
than  the  originally  proposed  IFA  650  signal  processor  series  from  TSI,  Inc.  The  maximum 
Doppler  frequency  capability  of  the  IFA  750  is  100  MHz,  while  the  maximum  Doppler 
frequency capability of the IFA 650 is only 35 MHz. LDV signal processors analyze the
Doppler bursts created when small seed particles pass through the crossed laser beams (i.e., the probe
volume), giving the instantaneous velocity component perpendicular to the fringes in the probe
volume. The scattered light is sensed by photodetectors, which convert optical energy into an
electrical signal. The strength of this signal is a function of laser power, seed particle material and
size, collection optics and photodetector efficiency. One signal processor is required for each
velocity component. This two-channel signal processor can be connected to any
combination of two of the three channels (i.e., velocity components) in sequence (i.e., 1-2, 1-3,
2-3) to obtain three-component velocity information at each measurement point.
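
The link between the Doppler frequency limit and the measurable velocity can be made concrete with the standard fringe model, in which the fringe spacing is d_f = λ / (2 sin κ) for a beam half-angle κ and the velocity is the Doppler frequency times d_f. The sketch below is illustrative only: the 514.5 nm wavelength, the 5 degree half-angle, and the function names are assumptions, not the optical parameters of the NCSU system.

```python
import math


def fringe_spacing(wavelength, half_angle_deg):
    """Fringe spacing in the probe volume, d_f = wavelength / (2 sin(kappa))."""
    return wavelength / (2.0 * math.sin(math.radians(half_angle_deg)))


def velocity_from_doppler(f_doppler, wavelength, half_angle_deg, f_shift=0.0):
    """Velocity component normal to the fringes.

    f_doppler : measured burst frequency in Hz
    f_shift   : optical frequency shift in Hz, if any (0 means no shifting)
    """
    return (f_doppler - f_shift) * fringe_spacing(wavelength, half_angle_deg)


# Assumed optics: argon-ion green line and a 5 degree beam half-angle.
WAVELENGTH = 514.5e-9
HALF_ANGLE = 5.0
v_max_750 = velocity_from_doppler(100e6, WAVELENGTH, HALF_ANGLE)
v_max_650 = velocity_from_doppler(35e6, WAVELENGTH, HALF_ANGLE)
```

Under these assumed optics the 100 MHz limit corresponds to roughly 295 m/s and the 35 MHz limit to roughly 103 m/s, which illustrates why the higher frequency range matters for high-speed flows. The actual limits depend on the real beam geometry and on any frequency shifting used.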

The  addition  of  this  Doppler  burst  signal  processing  hardware  and  software  to  the  existing 
optical  portion  of  the  LDV  system  completes  the  three-component  LDV  system  (two  channels 
simultaneously) at NCSU, thus forming the basis for a laboratory specializing in novel
non-intrusive experimental measurements in turbulent isothermal and reacting flows. In particular,
the turbulence characteristics of flow fields at elevated temperatures, where hot wire
anemometry cannot be used, can be investigated with this non-invasive instrument.

3.  EQUIPMENT  TESTING 

The IFA 750 LDV signal processor, interface hardware, and software have been fully tested
using two different flows. The hardware and software were found to operate as advertised by the
manufacturer.  The  first  flow  used  to  test  the  signal  processor  was  an  aluminum  oxide  particle 
laden  pulsatile  jet  flow  having  maximum  axial  velocities  in  excess  of  200  m/s.  The  second  flow 
used was a counter-flow diffusion flame flow which had maximum velocities on the order of 0.2
m/s. This flow was seeded with 0.5 micron diameter aluminum oxide particles. Velocity
measurements were made with and without combustion in this flow. The signal processor and
included acquisition and analysis software (both TSI FIND version 4.0 for DOS and TSI FIND
version 1.0 for Windows) operated flawlessly with both these flows. In addition, comparisons
between the measurements made with this new signal processor and those made using
counter-based LDV signal processors were also conducted. These comparisons showed excellent
agreement, with the new processor giving higher data rates than the counter processors, as
expected. In conclusion, the requested signal processor was purchased, installed and tested under
this research program. A collaborative arrangement with Dr. A. S. Nejad and his team of
scientists at WL/POPT is being discussed so that the swirl combustor experimental program,
initiated at WL/POPT, can continue at NCSU.
Modeling and Control of Rotating Stall and
Surge  for  Compressor  Systems  in  Turbojet  Engines 


Guoxiang  Gu 

Department  of  Electrical  and  Computer  Engineering 
Louisiana  State  University,  Baton  Rouge,  LA  70803-5901 

December  19,  1996 


Abstract 

Axial flow compressors are a vital part of turbine-based aeroengines. However, engine
performance is reduced by rotating stall and surge in axial flow compressors, which
are instabilities that arise in the unsteady fluid dynamics. The difficulty in suppressing rotating
stall and surge via active control lies in the fact that these two instabilities are associated with
nonlinear bifurcations. This motivated the bifurcation approach to rotating stall and surge control
that was pursued over the past year. Specifically, bifurcation control with output feedback is studied,
and stabilizability is characterized for both stationary and nonstationary bifurcations. These
results complement those developed by Abed and Fu, and are applicable to compressor control.
Both linear and nonlinear control laws are developed for rotating stall control based on the
Moore-Greitzer model. A surprising result is that the hysteresis loop associated with rotating stall can
be eliminated through the use of a lumped actuator and sensor, in contrast to existing
control methods, where distributed actuators or sensors are used. A design method for surge control
is also studied that effectively reduces the impact of both deep surge (pure surge) and classic surge
(coupled with rotating stall). In addition, the PI offered a special topics course on bifurcation
analysis and compressor control that trained graduate students to undertake research work in
this active research area. MATLAB programs were developed to simulate a distributed nonlinear
model of axial flow compressors. Hence, with the effort of the PI and his students, a strong research
program has been established at LSU on active control of axial flow compressors that will enhance the
operability of the compression system and improve future aeroengines.

1  Introduction 

Axial  flow  compressors  are  subject  to  two  distinct  aerodynamic  instabilities,  rotating  stall  and 
surge, which effectively limit the compressor operability. Both of these instabilities are disruptions
of the normal operating condition designed for steady and axisymmetric flow. Rotating stall is
a  severely  non-axisymmetric  distribution  of  axial  flow  velocity  which  manifests  itself  as  a  region 
of  severely  reduced  flow  that  rotates  at  a  fraction  of  the  rotor  speed.  Prolonged  operation  under 
this  condition  may  break  rotor  blades,  and  burn  the  turbine  [28].  Surge,  on  the  other  hand,  is 
an  axisymmetric  pumping  oscillation  which  can  cause  flameout  and  thus  engine  damage  as  well. 
Both  lead  to  large  penalties  in  performance  of  aeroengines. 

Because rotating stall and surge are difficult to predict accurately during design, problems are
often identified at a later stage and incur great expense in the engine development program. This fact
motivated the use of feedback control to enhance compressor operability by actively suppressing
rotating stall and surge. Two key developments are noticeable in the control of compression systems.
The first is the low order state-space model developed by Moore and Greitzer [28], which can be
extended into high order models as well as the “distributed” models in [2, 26]. It captures the nonlinear
dynamics of the compression system through its bifurcation characteristic [27]. The second is the
bifurcation-based rotating stall control law developed by Abed and his coworkers [25, 37], which
was shown to be effective for implementation in industrial turbomachinery by Nett and his
group [11, 12]. Other important research work along this direction includes the linear control method,
which extends the stable operating range of the compressor by up to 20% [29, 30], and the
backstepping method reported in [23], which leads to a global stabilization feedback control law.

Our research program is a continuation of the existing work on compressor control that is
characterized by a bifurcation approach. Indeed, the only mathematical tool used in our research is
classical bifurcation theory. The use of bifurcation theory in control of rotating stall and surge is
crucial to the extension of the operating range of the compression system, and ultimately to the
improvement of the aeroengine. Specifically, bifurcation control with output feedback was studied
in this research program with the objective of applications to rotating stall and surge control.
This approach yields several interesting results, which are validated with computer simulations.
In the next several sections, our methodology and research results will be reported in more detail,
together with our future research plan.


2  Research  Problems  and  Methodology 


This section addresses the research problems of rotating stall and surge control, and the
methodology employed in our research program.

Rotating  Stall  and  Surge  in  Axial  Flow  Compressors 


Axial  flow  compressors  are  subject  to  two  distinct  aerodynamic  instabilities,  rotating  stall 
and surge, which can severely limit the compressor performance. Both of these instabilities are
disruptions of the normal operating condition, which is designed for steady and axisymmetric flow.
The transition from normal compressor operation into rotating stall is depicted in Figure 1, where
Φ is the circumferential mean of the flow coefficient φ, and Ψ is the nondimensionalized pressure
rise.  As  the  flow  coefficient  through  the  compressor  is  decreased  (i.e.,  as  the  downstream  throttle 
closes  in  an  experiment),  the  pressure  rise  increases.  This  trend  continues  until  the  system  goes 
into  either  rotating  stall,  surge  (deep  surge),  or  both  (classic  surge). 


Figure  1  Schematic  compressor  characteristic,  showing  rotating  stall 


For  the  case  of  rotating  stall,  the  lowest  flow  coefficient  at  which  the  compressor  can  operate 
with  axisymmetric  flow  is  point  A,  the  peak  of  the  characteristic.  At  lower  flows,  an  abrupt 
transition  occurs  into  rotating  stall  (point  B).  There  is  a  substantial  drop  in  pressure  rise  and  a 
decrease in flow coefficient (segment A-B). This condition will persist until the flow is increased
to  point  C.  Thus  there  exists  a  severe  ‘hysteresis’,  or  range  of  flow  coefficients  at  which  two  stable 
operating  conditions  exist  -  steady  axisymmetric  flow  and  rotating  stall.  Once  a  compressor 
enters  fully  developed  rotating  stall,  both  rotor  and  stator  blades  pass  in  and  out  of  the  stalled 
flow  causing  tremendous  stress.  Any  substantial  length  of  time  in  this  mode  can  result  in  excessive 
internal temperatures due to low efficiency associated with the presence of rotating stall. In
addition, an even more serious consequence that can occur in an engine is that the low flow rates
obtained during rotating stall can lead to substantial overtemperatures in the burner and turbine
[14]. At present, the only remedy to get out of rotating stall is to shut down the engine and
restart it [28].

Rotating stall is a severely non-axisymmetric distribution of axial flow velocity, though steady
in  an  appropriate  (moving)  reference  frame,  around  the  annulus  of  the  compressor,  taking  the 
form  of  a  wave  or  ‘stall  cell’,  that  propagates  in  the  circumferential  direction  at  a  fraction  of 
the  rotor  speed.  Surge,  on  the  other  hand,  is  an  axisymmetric  oscillation  of  the  mass  flow  along 
the  axial  length  of  the  compressor.  Deep  surge  is  a  mostly  axisymmetric  oscillation  with  such 
a  large  variation  of  mass  flow  that  during  part  of  the  cycle  the  compressor  operates  in  reversed 
flow.  The  frequency  of  the  surge  oscillation  is  typically  an  order  (or  more)  of  magnitude  less  than 
that  associated  with  the  passage  of  rotating  stall  cells.  If  surge  occurs,  the  transient  consequences 
such  as  large  inlet  overpressures  can  also  be  severe.  However  the  circumstances  may  well  be  more 
favorable for returning to unstalled operation by opening either the throttle or internal bleed
valves, since the compressor can operate in an unstalled condition over part of each surge cycle.
Often surge and rotating stall are coupled (classic surge) although each can occur without the
other. For the case of classic surge, the compressor may pass in and out of rotating stall during
a surge cycle, with rotating stall characteristics appearing to be quite similar to those obtained
during steady-state operation. Thus rotating stall and surge, though coupled, are well defined
enough  that  each  can  be  studied  alone  for  low  speed  axial  flow  compressors  [29]. 

Rotating stall and surge are mostly caused by disturbances. Those having the largest and most
destabilizing effects are: circumferential distortion, planar turbulence, and combustion [22]. All
of these types of disturbances are present in full-scale aeroengines and are major sources of rotating
stall and surge.

•  Circumferential  distortion  refers  to  non-axisymmetric  flow  patterns  that  are  generated  by 
upstream  structures  such  as  bends  in  inlet  duct  or  boundary  layer  separation  caused  by  high 
angle  of  attack  at  the  engine  inlet.  The  inlet  distortion  can  also  be  correlated  with  aircraft 
angle  of  attack  and  yaw  angle. 

• Planar turbulence refers to axisymmetric oscillations in the flow field that are generated,
for example, by inlet buzz or ingestion of wakes from nose gear or other aircraft. Planar
turbulence is an inherently unsteady flow and has been recognized as an important source
of loss in stall margin.

• The combustion process introduces large unsteady back-pressure disturbances to the compression
system, causing steady state operating conditions to exhibit fluctuations in pressure and
mass flow large enough to cause the system to diverge.

Thus substantial rotating stall and surge margins are required in the selection of a compressor
operating point in order to maintain steady axisymmetric flow. Consequently, compression
systems are forced to operate at a point with far lower performance than point A, the peak of
the compressor characteristic (Figure 1). Even then, with all the above mentioned disturbances
present  in  the  worst  case,  it  does  not  seem  possible  for  the  compressor  to  escape  rotating  stall 
and  surge  unless  some  control  action  is  taken. 

Our  Methodology  to  Approach  Compressor  Control  Problems 

Since rotating stall and surge significantly limit the performance of turbine-based aeroengines,
and have catastrophic consequences if they occur in jet planes, compressor control has become a priority
in the list of research problems for AFOSR. Our research program, though small, reflects the effort
of AFOSR to resolve this important research problem in the near future. It is recognized in both
the research community and AFOSR that employing feedback control to suppress rotating stall
and surge is essential for extending the compressor operating range and improving the performance of
future aeroengines. Hence the research groups at MIT, led by Greitzer, and at the University of
Maryland, led by Abed, have been strongly supported in the past by AFOSR. The Moore-Greitzer
model laid a solid foundation for the use of feedback control for suppressing rotating stall and surge.
The analytical low order state-space model derived in [28] captures the characteristics of rotating
stall and surge, and has been used at both MIT [29, 30] and the University of Maryland [25, 37] to tackle
rotating stall control. This work was further pursued by the research group at the Georgia Institute of
Technology, led by Nett [6, 12]. It is interesting to note that classic bifurcation theory provides
a powerful tool for both analysis and synthesis of rotating stall control. The papers [2, 5]
were the first to describe rotating stall as a subcritical pitchfork bifurcation, and surge as a Hopf
bifurcation. The work of McCaughan in [27] gave a thorough analysis of rotating stall and surge,
in connection with the various parameters of the Moore-Greitzer model. The bifurcation analysis
yielded a nonlinear feedback controller, proposed in [25, 37], that stabilizes the critical operating
point, and it was later validated experimentally in [6, 12].
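
For readers less familiar with the terminology, the distinction between subcritical and supercritical pitchfork bifurcations can be illustrated with the standard scalar normal form. This is a textbook sketch only: the scalar state $x$, the parameter $\mu$, the coefficient $a$, and the gain $k$ are illustrative symbols, and they are not the Moore-Greitzer variables or the controllers of [25, 37].

```latex
\dot{x} = \mu x + a x^{3}, \qquad x^{*} = 0 \quad \text{or} \quad (x^{*})^{2} = -\mu / a .
% Linearization about a nonzero equilibrium:
\left.\frac{\partial \dot{x}}{\partial x}\right|_{x^{*}} = \mu + 3a\,(x^{*})^{2} = -2\mu .
% a > 0 (subcritical): nonzero equilibria exist for \mu < 0 and are unstable (-2\mu > 0).
% a < 0 (supercritical): nonzero equilibria exist for \mu > 0 and are stable (-2\mu < 0).
% A cubic feedback term u = -k x^{3} added to the right-hand side gives
\dot{x} = \mu x + (a - k)\,x^{3},
% which is supercritical whenever k > a.
```

In the subcritical case the origin loses stability at $\mu = 0$ with no nearby stable equilibrium to capture the state, which is the mechanism behind the abrupt jump and hysteresis described for rotating stall. Rendering the bifurcation supercritical replaces the jump with a gradual, stable branch.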

Our research program has focused on a bifurcation approach to rotating stall and surge control
using the low order Moore-Greitzer model. This is a continuation of the existing work in compressor
control, and has the potential to contribute to nonlinear robust control. It should
be clear that the difficulty associated with compressor control is due to the lack of corresponding
theory and practice for bifurcation control, since rotating stall and surge are both
nonlinear bifurcation phenomena. Very few results are available for bifurcation control except
those in [3, 4, 19, 20] where state feedback is employed for bifurcation stabilization. Moreover
the success in nonlinear control of rotating stall as reported in [25, 37, 6, 12] is based on bifurcation
theory. These considerations motivated us to adopt a bifurcation control methodology for
compressor control. A more profound reason for using the bifurcation approach is its relevance to nonlinear
robust  control.  In  the  past,  nonlinear  control  has  focused  on  extension  of  linear  control  theory 
and  design  to  nonlinear  systems.  The  current  trend  in  nonlinear  robust  control  follows  the  same 
path. However nonlinear systems have unique features that do not exist for linear systems.
Simple generalization of linear control theory to nonlinear systems may not work. This is especially
true for bifurcated systems which involve uncertain parameters. At critical values of the
uncertain parameters, more than one equilibrium is born at which stability changes. Often the
critical modes of the linearized control systems are uncontrollable, or unobservable, or both. This
is where linear control theory fails, and it is exactly the situation in the rotating stall control problem.
The  development  of  bifurcation  control  theory  is  clearly  an  important  part  of  nonlinear  robust 
control, and has no parallel in linear robust control. Thus the bifurcation approach to compressor
control problems will advance our knowledge of bifurcation stabilization and nonlinear robust
control as well.

3  Research  Results 

The  schematic  compression  system  is  shown  in  Figure  2  below: 


Figure 2 Schematic of compressor showing nondimensionalized lengths.


The total pressure upstream of the compressor is denoted by $p_T$. The air flows
through inlet guide vanes that straighten the flow. The compressor acts like an actuator that raises
the pressure of the flow at the back of the compression system. The purpose of the compression
system is to generate the required pressure rise, which is the pressure difference between $p_S$ and
$p_T$. The established pressure rise is then used to provide the thrust for the jet airplane. Hence
the compression system is the heart of the aeroengine. The ultimate objective of our research
program is the improvement of aeroengine performance.

In  the  past  year,  our  research  program  has  focused  on  rotating  stall  and  surge  control  that 
are  essential  to  compressor  performance.  Classic  bifurcation  theory  is  employed  to  analyze  the 
problems  of  rotating  stall  and  surge,  and  to  obtain  the  feedback  controllers  that  stabilize  the 
critical  operating  points  and  enlarge  the  operating  range  of  the  compression  system.  Our  research 
results  are  reported  in  a  series  of  papers  ([16], [17], [9], [35], [21]),  and  are  summarized  as  follows. 

•  Bifurcation  stabilization  with  output  feedback  [16]. 

As mentioned earlier, rotating stall and surge control are closely connected with bifurcation
stabilization, because rotating stall corresponds to a subcritical pitchfork bifurcation
and surge corresponds to a Hopf bifurcation. Stabilization of nonlinear control systems with
smooth state feedback control has been studied by a number of people [1, 3, 4, 7, 36]. An
interesting situation for nonlinear stabilization is when the linearized system has uncontrollable
modes on the imaginary axis with the rest of the modes stable. These are the so-called critical cases
for which the linear theory is inadequate. The problem becomes more intricate if the underlying nonlinear
system involves a real-valued parameter. At critical values of the parameter, the linearized
system has critical modes corresponding to eigenvalues on the imaginary axis, and additional
equilibrium solutions are born. The bifurcated solutions may or may not be stable. The
instability of the bifurcated solution may cause a “hysteresis loop” in the bifurcation diagram
for both subcritical pitchfork bifurcation and Hopf bifurcation [18], and induce undesirable
physical phenomena. This is exactly the case of rotating stall in axial flow compressors.
Hence bifurcation stabilization is an important topic in nonlinear control.

While  most  of  the  existing  work  in  the  open  literature  considers  only  state  feedback  for 
bifurcation  stabilization,  compressor  control  systems  employ  output  feedback  because  often 
some  of  the  state  variables  are  not  measurable,  or  too  expensive  to  measure.  It  is  thus 
necessary to investigate bifurcation stabilization for the case when only output measurements
are available, and to study the stabilizability property for various bifurcated systems.
Our research work on bifurcation stabilization is reported in [16]. Specifically, the nonlinear
system under consideration is single-input/single-output, and it involves a single parameter.
At the critical value of the parameter, the linearized system possesses either a simple
zero eigenvalue, or a pair of imaginary eigenvalues, and the bifurcated solution is unstable.
Output feedback stabilization via smooth local controllers is studied for both stationary
and nonstationary bifurcation. Two results are established in [16] for bifurcation stabilization.
The first one is stabilizability conditions for the case where the critical mode is not
linearly  observable  through  output  measurement.  It  is  shown  that  nonlinear  controllers  do 
not  offer  any  advantage  over  the  linear  ones  for  bifurcation  stabilization.  The  second  one  is 
stabilizability  conditions  for  the  case  when  the  critical  mode  is  linearly  observable  through 
output  measurement.  It  is  shown  that  linear  controllers  are  adequate  for  stabilization  of 
transcritical  bifurcation,  and  quadratic  controllers  are  adequate  for  stabilization  of  pitchfork 
and  Hopf  bifurcations,  respectively.  The  proofs  are  constructive.  Thus  the  results  in  this 
paper  can  be  used  to  synthesize  stabilizing  controllers,  if  they  exist. 


• Nonlinear feedback for rotating stall control [17].

A  nonlinear  feedback  control  law  is  proposed  for  rotating  stall  control  in  [17].  This  feedback 
control  law  is  different  from  that  of  [25,  37]  in  that  no  distributed  sensors  are  required, 
and the output measurement is chosen as the pressure rise, which is a lumped parameter. This is
important as distributed sensors such as hot wires for flow rate measurements are expensive
and delicate, while pressure transducers are more robust in a volatile flow field. This was
the starting point for considering a feedback control law of the form

u{t)  =  —= 

where $\Psi$ denotes the pressure rise. The proposed control system employs pressure rise
as output measurement and throttle position as actuating signal, for which both sensor
and actuator exist in the current configuration of axial compressors and are lumped in
nature, in contrast to other control methods that employ either distributed actuators,
or distributed sensors, or both.


It should be emphasized that the results of [17] are obtained entirely with classic
nonlinear bifurcation theory. This is due to the fact that linear control theory fails to
apply to bifurcated systems such as rotating stall and surge in compression systems.
Classical  bifurcation  analysis  for  nonlinear  dynamics  is  used  to  derive  a  nonlinear  feedback 
control  law  that  eliminates  the  hysteresis  loop  associated  with  rotating  stall  and  extends  the 
stable  operating  range  in  axial  compressors.  The  stability  of  the  critical  operating  point  for 
the controlled compressor is established using the center manifold theorem. Although the results
in [17] are elementary and no advanced bifurcation control theory as developed in [4, 3] is used, they are
similar to those in [25, 37]. The stabilization results are verified, somewhat surprisingly, via computer
simulations with high order compression system models. More importantly, the use of
pressure rise as output measurement also gives the opportunity for surge control. Recall
that pure surge dynamics is governed by differential equations of flow rate and pressure
rise, but not the amplitude of the disturbance flow. Hence the stabilizing control laws of
[25, 37] cannot work for surge control. We are currently investigating the possibility of
surge control with the same feedback control law.

•  Linear  and  nonlinear  feedback  laws  for  rotating  stall  control  [9]. 

The  control  system  proposed  in  the  paper  of  [9]  is  similar  to  that  of  [17]  except  that  the 
output  measurements  can  be  either  pressure  rise  or  averaged  flow  rate.  Both  linear  and 
nonlinear  feedback  control  laws  are  investigated  that  yield  similar  results  for  rotating  stall 
control.  The  foundation  of  the  paper  lies  in  those  results  established  in  [16].  Specifically, 
the  results  on  nonlinear  bifurcation  stabilization  in  [16]  are  applied  to  rotating  stall  control 
to  derive  stabilizing  feedback  controllers  in  [9].  It  should  be  clear  that  the  challenge  to  the 
proposed  control  system  is  that  the  critical  mode  of  the  linearized  system  corresponding 
to rotating stall is neither controllable nor observable. Both linear and nonlinear feedback
control  laws  are  proposed  and  are  shown  to  be  effective  in  elimination  of  the  hysteresis  loop 
associated  with  rotating  stall  and  in  extension  of  the  stable  operating  range  of  the  axial  flow 
compressor. 

Although the results in this paper are applications of those in [16], the paper makes several interesting
points. First, it relates rotating stall control to equivalent bifurcation stabilization, which was
studied in [4, 3, 16]. Hence bifurcation stabilization can be used to synthesize stabilizing
controllers  for  rotating  stall  control.  Second,  it  indicates  the  stability  ranges  for  different 
feedback  controllers,  and  these  ranges  are  finite.  Moreover  it  is  possible  that  the  stabilizing 
ranges  of  the  feedback  gains  can  be  zero  for  some  of  the  compressor  control  systems,  and 
thus  stabilizing  controllers  do  not  exist  in  some  cases.  Fortunately  stabilizing  controllers  do 
exist  for  practical  compressor  control  systems  such  as  the  one  at  MIT.  Again  the  results  in 
this  paper  are  validated  with  computer  simulations. 

•  Further  results  on  rotating  stall  control  [35]. 

In  compressor  control  with  throttle  as  actuators,  an  important  consideration  is  that  the 
operating  point  is  different  from  the  critical  point  of  bifurcation,  and  that  the  throttle  has 
to  be  positive  due  to  the  physical  constraint.  This  problem  is  addressed  in  [35]  where  sensor 
signals  are  averaged  flow  rate  on  the  circumference  of  the  compressor  or  the  pressure  rise. 
Sufficient  conditions  are  derived  for  the  control  law  gains  to  guarantee  that  the  subcritical 
pitchfork bifurcation responsible for hysteresis is rendered supercritical and that the
bifurcated solution is asymptotically stable. The proposed control laws give a practical solution
for rotating stall in axial flow compressors. The numerical examples show the
transformation of the bifurcation from subcritical to supercritical and the elimination of the
hysteresis region.

•  Bifurcation  based  surge  control  [21]. 

The focus of the paper [21] is surge control for axial flow compressors. Although there
exists a family of state feedback laws which stabilize the nonaxisymmetric equilibria near
the operating point and eliminate the hysteresis induced by rotating stall, the Hopf bifurcation
associated with surge still exists under rotating stall control laws. The results in [21] introduce
test  functions  whose  zeros  are  critical  to  Hopf  bifurcation  for  the  closed-loop  system  where 
nonlinear  feedback  control  laws  are  employed.  These  test  functions  are  given  in  compact 
form.  A  particular  test  function  is  also  developed  to  determine  stability  of  the  periodic 
solutions  born  at  Hopf  bifurcation.  The  analysis  based  on  these  test  functions  leads  to  a 
new  method  of  feedback  design  for  control  of  both  stationary  and  Hopf  bifurcation  in  axial 
flow  compressors.  Using  the  techniques  proposed  in  [21],  feedback  controllers  can  be  designed 
to meet several bifurcation control requirements, including elimination of the behavior of
surge coupled with rotating stall. This is a significant result because in engineering practice,
rotating stall and surge are often coupled, which is called classic surge.

The success of this research program is inseparable from the control group at Wright-Patterson
Air Force Base (WPAFB), led by Dr. Siva Banda. In fact almost all the results summarized in
this report are the consequences of collaboration with the control group at WPAFB, including
Dr. Andy Sparks, Dr. Siva Banda, and Mr. Paul Blue. Hence we are extremely grateful to
the control group of Dr. Banda, and look forward to further collaboration in the near future.

4  Conclusion  and  Future  Research 

In the past year, our research work has focused on the three-state Moore-Greitzer model. The results
on stabilization of rotating stall and surge control are established for the low order compressor
model at MIT. Due to time constraints, the proposed work for the high order compressor model was
not carried out, though our results reported in [16, 17, 9, 35, 21] are validated with a high order
“distributed” model. This will be studied in future research work. In particular, compressor
control systems using air jets as actuators and pressure transducers as measurement sensors will
be the emphasis of future research on compressor control. Moreover $H_\infty$ optimization will be
introduced for compressor control to improve compressor performance. We are confident that
with the leadership of Dr. Siva Banda, we will make further contributions to the DoD mission

SCALEABLE PARALLEL PROCESSING
FOR  REAL-TIME  RULE-BASED  DECISION  AIDS 


Chun-Shin  Lin 
Associate  Professor 
Department  of  Electrical  Engineering 
University  of  Missouri-Columbia 

Abstract 


The  project  was  an  extension  of  the  study  on  parallel  processing  for  decision  aids  performed  by 
the  principal  investigator  during  the  summer  of  1995  at  the  Air  Force  Wright  Laboratory.  The  rapid 
technology development in the past two decades has made modern combat a complicated task. A large
amount  of  information  can  be  available  in  a  mission  from  both  on-board  and  off-board  sources.  Effectively 
utilizing the information is necessary to achieve successful and optimal results. This project extended the
previous study and investigated three issues relevant to parallel processing for decision aids. The first one
was on fault tolerance. The previous study proposed a parallel processing technique for two-state rule-based
decision aids. Each subtask was exclusively assigned to a processing node. In this extension study,
we examined a modified scheme that assigned each subtask to multiple processing nodes in order to create
fault tolerance. The second part of the study was on parallel neurocomputing. Neural networks have
become  an  important  part  of  decision  aids/intelligent  systems.  Basis  function  networks  were  considered  in 
this study. Basis function units were grouped into subsets and assigned to different processors. The
PARAGON computer has been used for experiments. The speed-up is excellent when the network size is
big. This is because the amount of time for inter-processor communication is relatively small compared to
that  for  basis  function  computation.  Results  show  that  the  architecture  of  mesh  processors  is  good  for 
implementation  of  large  basis  function  networks.  In  the  third  part  of  the  report,  we  discuss  automatic  task 
decomposition for scaleability. With scaleability, a decision aid/intelligent system could fully utilize
available  processing  resources.  Automatic  task  assignment  can  help  efficiently  reconfigure  the  system. 

1. INTRODUCTION


Background 

The  project  is  an  extension  of  the  study  on  parallel  processing  for  decision  aids  [1]  performed  by 
the  principal  investigator  during  the  summer  of  1995  at  the  Air  Force  Wright  Laboratory.  The  rapid 
technology development in the past two decades has made modern combat a complicated task. A large
amount  of  information  can  be  available  in  a  mission  from  both  on-board  and  off-board  sources.  Effectively 
utilizing  the  information  is  necessary  to  achieve  successful  and  optimal  results.  Decision  aids  that  operate 
in  real-time  are  an  important  issue  as  all  DoD  components  strive  to  reduce  the  crew  size  of  their  various 
weapon systems. The decision aids will help reduce the workload of the crew and increase the efficiency
and reliability of operations. Since a large number of criteria and rules must be evaluated and checked in a
very  short  time  period  in  combat  automation,  parallel  processing  has  been  suggested  to  meet  the  timing 
requirement  [1-3]. 

An intelligent decision aid that employs a two-state rule-based scheme may consist of four major
portions [2,3]: (1) information collection, (2) information processing and criterion evaluation, (3) rule
checking, and (4) action execution (see Figure 1). The information is collected by sensors. The
information processing may involve conventional computation and algorithms, as well as neurocomputing
and fuzzy logic. Rule-checking will take the criteria values (binary) and determine which rules should be
fired.  Actions  that  are  associated  with  the  fired  rules  will  then  be  executed. 


Figure  1.  Basic  components  in  a  two-state  rule-based  decision  aid 

The Extension Research


This  extension  project  investigated  three  issues.  A  brief  description  is  given  below  with  details 
discussed  in  Sections  3,  4  and  5. 

1.  Fault  tolerance 

The  previous  study  [1]  proposed  a  parallel  processing  technique  for  two-state  rule-based  decision  aids. 
Each  subtask  was  exclusively  assigned  to  a  processing  node.  In  this  extension  study,  we  examined  a 
modified scheme that assigns each subtask to multiple processing nodes in order to create fault
tolerance,  which  is  important  for  reliability. 

2.  Parallel  neurocomputing 

Neural  networks  are  becoming  an  important  part  of  decision  aids/intelligent  systems.  In  this  study,  we 
considered  the  implementation  of  basis  function  network  computation  using  an  architecture  of  mesh 
processors.  Basis  function  units  were  grouped  into  subsets  and  assigned  to  different  processors.  The 
PARAGON  computer  [4]  has  been  used  for  evaluating  the  speedup  factor. 

3. Automatic task decomposition for scaleability

Scaleability is a desired feature. With scaleability, a decision aid/intelligent system could fully
utilize available processing resources. Automatic task decomposition is needed to efficiently
reconfigure the system. In Figure 1, each stage can consist of a large number of small procedures. For
instance,  evaluating  the  time  derivative  of  a  sensed  value  or  checking  whether  a  rule  should  be  fired  can 
be  viewed  as  basic  procedures.  In  this  report,  we  will  discuss  some  potential  techniques  for 
automatically  assigning  tasks  to  available  processing  units. 
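
The grouping of basis function units described in item 2 can be sketched as follows. This is a minimal serial sketch under stated assumptions: the Gaussian form of the basis function, the contiguous block split, and all function names are illustrative choices and are not taken from the study. Each "processor" computes a partial weighted sum over its own subset of units, and the partial sums are then added to form the network output.

```python
import math


def gaussian_unit(x, center, width):
    """Response of one basis function unit (Gaussian form is an assumption)."""
    dist2 = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-dist2 / (2.0 * width ** 2))


def split_units(num_units, num_procs):
    """Split unit indices 0..num_units-1 into num_procs contiguous subsets."""
    base, extra = divmod(num_units, num_procs)
    subsets, start = [], 0
    for p in range(num_procs):
        size = base + (1 if p < extra else 0)
        subsets.append(range(start, start + size))
        start += size
    return subsets


def partial_sum(x, unit_ids, centers, widths, weights):
    """Work done by one processor: weighted sum over its own units only."""
    return sum(weights[k] * gaussian_unit(x, centers[k], widths[k])
               for k in unit_ids)


def network_output(x, centers, widths, weights, num_procs):
    """Combine the per-processor partial sums into the network output."""
    subsets = split_units(len(centers), num_procs)
    partials = [partial_sum(x, s, centers, widths, weights) for s in subsets]
    return sum(partials)
```

In a real mesh implementation each call to `partial_sum` would run on a different node and only the scalar partial results would be communicated, which is consistent with the observation that communication time is small relative to basis function computation for large networks.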

2.  THE  INTELLIGENT  DECISION  AID 

The  basic  block  diagram  of  an  intelligent  decision-aid  has  been  given  in  Figure  1.  Information  is 
collected  by  sensors.  The  information  sensed  may  be  preprocessed  to  generate  the  derived  data  for  the 
criterion  evaluator.  One  simple  example  of  derived  data  is  the  rate  of  change  of  a  sensed  value.  Neural 
networks may be used in preprocessing too. The outputs of the criterion evaluator are binary values
(two-state). A criterion indicates whether a special condition or a sequence of conditions is satisfied or not.
One example is that a specific voltage value has been kept over 5V in the past three time intervals. Criteria
are  inputs  to  the  rule-checking  module. 
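
The two examples in this paragraph, a derived rate of change and the "over 5V in the past three time intervals" criterion, can be sketched as follows. This is an illustrative sketch only; the class and function names are hypothetical, and the choice to report 0 until three samples have been seen is an assumption.

```python
from collections import deque


def rate_of_change(previous, current, dt):
    """Derived data: finite-difference rate of change of a sensed value."""
    return (current - previous) / dt


class HeldAboveCriterion:
    """Two-state criterion: 1 if the last `count` samples all exceeded
    `threshold`, otherwise 0 (including while fewer than `count` samples
    have been seen, which is an assumption of this sketch)."""

    def __init__(self, threshold, count):
        self.threshold = threshold
        self.history = deque(maxlen=count)  # keeps only the newest samples

    def update(self, sample):
        self.history.append(sample)
        full = len(self.history) == self.history.maxlen
        held = all(v > self.threshold for v in self.history)
        return 1 if full and held else 0
```

A criterion object of this kind would be updated once per time interval, and its 0/1 output would be passed to the rule-checking module.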

Ci is used to denote the i-th criterion, which has a value of either 0 or 1. ~Ci denotes the
complement of Ci. A rule is represented as a logic minterm (AND of Boolean variables) [2,3]. For
instance,

Rk : (action list) <- C2 & ~C5 & C12

where & denotes the logic AND. The above rule is fired when C2 = 1, C5 = 0 and C12 = 1. When the rule
Rk  is  fired,  actions  in  the  action  list  will  be  executed.  Displaying  a  piece  of  information  to  the  pilot, 
recommending  an  action  or  even  taking  over  part  of  a  pilot’s  tasks  are  examples  of  actions. 
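
The minterm rule above can be evaluated with a few lines of code. The sketch below encodes a rule as a list of signed criterion numbers (+i for Ci, -i for ~Ci); the function names and the example action string are hypothetical and serve only as illustration.

```python
def rule_fires(literals, criteria):
    """Return True if every literal of the minterm is satisfied.

    literals: signed criterion numbers, +i meaning Ci and -i meaning ~Ci.
    criteria: dict mapping criterion number -> current value (0 or 1).
    """
    return all(criteria[abs(lit)] == (1 if lit > 0 else 0) for lit in literals)


def fired_actions(rules, criteria):
    """Collect the action lists of all rules whose minterm is satisfied.

    rules: list of (literals, action_list) pairs.
    """
    actions = []
    for literals, action_list in rules:
        if rule_fires(literals, criteria):
            actions.extend(action_list)
    return actions


# Rk : (action list) <- C2 & ~C5 & C12, with a hypothetical action name.
example_rules = [([2, -5, 12], ["display warning to pilot"])]
```

With C2 = 1, C5 = 0 and C12 = 1 the example rule fires and its action list is returned; changing C5 to 1 leaves the list empty.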

The  Parallel  Processing  System 

This  study  on  parallel  processing  assumes  a  2-dimensional  mesh  processor  architecture.  Each 
processor  (a  node)  can  execute  its  own  program  and  communicate  with  others  through  some  SEND  and 
RECEIVE  commands.  The  Intel  Paragon  Computer  [4]  available  in  Wright  Laboratory  for  defense 
research  studies  belongs  to  this  type.  This  Paragon  consists  of  352  general-purpose  nodes  called  GP  nodes. 
Each  GP  node  has  a  single  i860  XP  application  processor,  as  well  as  an  additional  i860  XP  as  a  message 
processor for message operations. When an application decides to send a message, the message processor
handles the work and frees the application processor to continue with numerical computing. Each GP node
has  its  own  32  Mbytes  of  memory.  The  computer  system  is  scaleable  and  can  be  easily  expanded  by 
adding  new  nodes.  Since  the  computer  is  a  multi-user  system,  interference  between  different  processes  may 
exist  due  to  data  transmission. 

It is noted that the Intel i860 and i960 are used on today’s military aircraft. Thus the results from
the  study  using  the  Paragon  are  more  easily  transferable  to  practical  use  in  operational  environments. 

With this kind of structure, one can distribute the rule-checking task across p processors and have
each processor evaluate the assigned rules. The selection of p should be based on the availability of nodes
and  the  processing  load. 
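
A minimal sketch of this decomposition follows. The round-robin assignment policy and the function names are assumptions made for illustration; the text above does not prescribe how rules are mapped to nodes. Each node checks only its own rules, and the union of the per-node results equals the result of checking every rule on one processor.

```python
def assign_rules(num_rules, p):
    """Assign rule numbers 1..num_rules to p nodes in round-robin order
    (an assumed policy; rule j goes to node (j - 1) % p)."""
    nodes = [[] for _ in range(p)]
    for j in range(1, num_rules + 1):
        nodes[(j - 1) % p].append(j)
    return nodes


def check_assigned(node_rules, rules, criteria):
    """Work done by one node: return its rule numbers whose minterm holds.

    rules: dict rule number -> signed criterion numbers (+i is Ci, -i is ~Ci).
    criteria: dict criterion number -> 0 or 1.
    """
    fired = []
    for j in node_rules:
        if all(criteria[abs(lit)] == (1 if lit > 0 else 0) for lit in rules[j]):
            fired.append(j)
    return fired


def check_all(rules, criteria, p):
    """Serial stand-in for the parallel run: gather results from all nodes."""
    fired = []
    for node_rules in assign_rules(len(rules), p):
        fired.extend(check_assigned(node_rules, rules, criteria))
    return sorted(fired)
```

On a machine such as the Paragon, `check_assigned` would run concurrently on each node and the fired rule numbers would be exchanged with SEND and RECEIVE operations; the loop in `check_all` only imitates that gathering step.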

Knowledge Representation in Scaleable Two-State Rule-based Systems

As introduced earlier, the rule-based system will have the rules represented in the form


Rk : (action list) <- Ci & ~Cj & Cm


The data structure must indicate which criteria are included in each rule. The data structure is illustrated in
Figure 2. A list of criteria, called LIST_CRITERIA, used by all rules is constructed. A number i in the
list indicates that Ci is included and -i indicates that ~Ci is included. Another list RULE_POINTER stores
the positions of the last criteria of all rules (see Figure 2) [1-2]. For example, RULE_POINTER[j] is the
pointer to the last criterion used by the rule Rj. If a = RULE_POINTER[j-1] and b =
RULE_POINTER[j], then the rule Rj uses the criteria denoted by LIST_CRITERIA[a+1],
LIST_CRITERIA[a+2], ..., LIST_CRITERIA[b]. Note that h, p and q in the figure are either positive
or negative criteria numbers.


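[Figure 2 layout: LIST_CRITERIA is indexed by element position 1, 2, 3, ..., a, a+1, ..., b; the
entries at positions a+1 through b form the list for one rule. RULE_POINTER is indexed by element
position (= rule number); its elements j-1 and j point to positions a and b.]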

Figure  2.  Data  structure  for  rule  base. 
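
The forward structure described above can be sketched in a few lines of Python. This is only an illustrative sketch: the three-rule base is hypothetical, index 0 of each list is an unused placeholder so that positions match the 1-based numbering of the text, and RULE_POINTER[0] = 0 is an assumed sentinel that lets the a = RULE_POINTER[j-1] convention also work for the first rule.

```python
# Hypothetical rule base (not from the report):
#   R1 <- C1 & -C2 & C4,   R2 <- C2 & C3,   R3 <- -C1 & -C3 & C4 & C5
LIST_CRITERIA = [None, 1, -2, 4, 2, 3, -1, -3, 4, 5]   # index 0 unused
RULE_POINTER = [0, 3, 5, 9]   # RULE_POINTER[j] = position of the last criterion of Rj


def criteria_of_rule(j):
    """Return the signed criterion numbers used by rule Rj."""
    a = RULE_POINTER[j - 1]
    b = RULE_POINTER[j]
    return LIST_CRITERIA[a + 1 : b + 1]


def rule_fires(j, truth):
    """Rj fires when each positive criterion is true and each negated one is false.

    `truth` maps a criterion number to its current Boolean value.
    """
    return all(truth[abs(c)] == (c > 0) for c in criteria_of_rule(j))


truth = {1: True, 2: False, 3: False, 4: True, 5: True}
fired = [j for j in range(1, len(RULE_POINTER)) if rule_fires(j, truth)]
print(criteria_of_rule(2), fired)
```

Storing every rule in two flat integer lists keeps the rule base compact and makes it easy to ship a sublist to each processing node.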

The system fires a rule at the time when all of its criteria become true. The rule is kept in the
fired state until one or more of its criteria become unsatisfied. At any time, only the rules
involving criteria whose values have changed need to be checked. It is therefore more efficient to
construct a data structure that makes it easy to find the set of rules that need to be checked.
This requires backward pointers from criteria to rules, stored in a data structure similar to the
one above. Figure 3 shows such an index structure for backward pointers. CRITERION_POINTER provides
information for quickly determining which subset of rules in the list LIST_RULES should be evaluated.


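[Figure 3 layout: LIST_RULES holds, for each criterion i, the list of rules that use it.
CRITERION_POINTER is indexed by element position (= criterion number); its elements i-1 and i
delimit the list for criterion i.]


Figure 3. An index structure denoting which rules are used by a criterion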

Note that the first data structure consists of complete knowledge and the second one can be derived from it.
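
As a sketch of how the backward index might be derived from the forward structure, the function below builds CRITERION_POINTER and LIST_RULES from LIST_CRITERIA and RULE_POINTER. The rule base is the same hypothetical three-rule example as before, the function names are illustrative, and the 1-based layout with a sentinel at index 0 is an assumption rather than something the report specifies.

```python
# Hypothetical forward structure (index 0 is an unused placeholder / sentinel).
LIST_CRITERIA = [None, 1, -2, 4, 2, 3, -1, -3, 4, 5]
RULE_POINTER = [0, 3, 5, 9]


def build_backward_index(list_criteria, rule_pointer, num_criteria):
    """Derive (CRITERION_POINTER, LIST_RULES) from the forward structure."""
    rules_using = [[] for _ in range(num_criteria + 1)]
    for j in range(1, len(rule_pointer)):
        for pos in range(rule_pointer[j - 1] + 1, rule_pointer[j] + 1):
            rules_using[abs(list_criteria[pos])].append(j)
    list_rules = [None]            # index 0 unused
    criterion_pointer = [0]        # sentinel for criterion 0
    for i in range(1, num_criteria + 1):
        list_rules.extend(rules_using[i])
        criterion_pointer.append(len(list_rules) - 1)
    return criterion_pointer, list_rules


def rules_to_check(changed, criterion_pointer, list_rules):
    """Rules that involve at least one criterion whose value has changed."""
    found = set()
    for i in changed:
        found.update(list_rules[criterion_pointer[i - 1] + 1 : criterion_pointer[i] + 1])
    return sorted(found)


CRITERION_POINTER, LIST_RULES = build_backward_index(LIST_CRITERIA, RULE_POINTER, 5)
print(rules_to_check([2], CRITERION_POINTER, LIST_RULES))
```

Only the rules returned by rules_to_check need to be re-evaluated after a change, which is the purpose of the backward pointers.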

Rule-Base  Knowledge  for  Each  Processing  Node 

In the previous study [3], each node checked only a subset of rules and therefore did not need the
complete rule-base knowledge. The data for the subset of rules can be represented in a structure
similar to that for the overall knowledge shown in Figures 2 and 3. However, only a sublist from
LIST_CRITERIA and a sublist from RULE_POINTER will be stored for each processor. In the index
data structure, the length of CRITERION_POINTER will remain the same but the rules in LIST_RULES
not handled by the assigned node will be removed. The subsets of rule bases can be generated from
the overall rule base by a computer program.


3.  ASSIGNING  EACH  RULE  SUBSET  TO  MULTIPLE  PROCESSORS 

The  scheme  previously  investigated  decomposed  the  rule  base  into  subsets  and  assigned  one 
exclusive  node  to  check  a  subset  of  rules.  If  a  processing  node  fails,  all  rules  assigned  to  it  would  not  be 
checked. 

To create fault tolerance, one can divide the rule-checking task into smaller subtasks and have
each subtask assigned to at least two processing nodes. Figure 4 illustrates the idea. The rules
are divided into smaller subsets and each processing node covers a larger set of rules (with
overlapping). Each subset can now be checked by one of two or more processors. Although a subset
is assigned to two or more nodes, it will be checked by only one processor, the one that becomes
available first. With the modified arrangement, the failure of nodes that share no common subset
will not cause the system to fail.


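[Figure 4 layout: a row of consecutive rule subsets, with the node assignment (..., n-1, n, n+1,
...) drawn beneath it so that neighboring nodes cover overlapping groups of subsets.]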

Figure  4.  Decomposition  of  rules  and  assignment  to  nodes 

We have performed experiments for cases with two processing nodes assigned to each subset of
rules. The rules assigned to each processing node were divided into 2 or more subsets. Figure 5
gives examples of different arrangements. Figure 5(a) shows an arrangement for 3 processors, each
of which is assigned 2 subsets. Note that each subset is covered by two processors. Each processing
node starts from its middle subset. For example, for the 4 processors/4 subsets case in Figure 5(g),
processing node 2 will check subset 4 first, then 5, 3 and 6. Before checking a subset, a processor
probes the neighbors that are also assigned the same subset to see if any neighbor has checked it.
If not, the processing node sends a message to those neighbors and checks the subset. If so, the
processor skips this subset and tries to check the others.
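
The checking order and the probe-then-claim step can be illustrated with a small single-process simulation. This is a sketch under stated assumptions, not the Paragon implementation: the node-to-subset assignment for the 4-node, 4-subset case (including the wrap-around for node 4) is inferred from the description, the generalization of the "middle first" order is an assumption checked only against the one example in the text, and a shared dictionary stands in for the probe and message exchange between neighbors.

```python
def middle_out(subsets):
    """Visit order that starts at the middle subset and alternates outward."""
    n = len(subsets)
    mid = (n - 1) // 2
    order = [subsets[mid]]
    for step in range(1, n):
        for idx in (mid + step, mid - step):
            if 0 <= idx < n:
                order.append(subsets[idx])
    return order


# Assumed assignment: 8 subsets, each covered by two of the four nodes.
ASSIGNMENT = {1: [1, 2, 3, 4], 2: [3, 4, 5, 6], 3: [5, 6, 7, 8], 4: [7, 8, 1, 2]}


def run_round(assignment, alive):
    """Simulate one checking round; live nodes take turns, one subset per turn."""
    claimed = {}                               # subset -> node that checked it
    queues = {n: middle_out(s) for n, s in assignment.items() if n in alive}
    while any(queues.values()):
        for node in sorted(queues):
            q = queues[node]
            while q and q[0] in claimed:       # probe: a neighbor already checked it
                q.pop(0)
            if q:
                claimed[q.pop(0)] = node       # announce the claim, then check the subset
    return claimed


print(middle_out(ASSIGNMENT[2]))
print(run_round(ASSIGNMENT, alive={1, 3, 4}))
```

In this toy run every subset is still checked when node 2 is removed, because each of its subsets is also held by a surviving neighbor.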

[Figure 5 panels, each showing a row of rule subsets above the processing nodes that cover them:
(a) 3 nodes, 2 subsets per node.
(b) 3 nodes, 3 subsets per node.
(c) 3 nodes, 4 subsets per node.
(d) 3 nodes, 5 subsets per node.
(e) 4 nodes, 2 subsets per node.
(f) 4 nodes, 3 subsets per node.
(g) 4 nodes, 4 subsets per node (subsets numbered 1 to 8).
(h) 4 nodes, 5 subsets per node.]


Figure 5. Examples of arrangements using different numbers of subsets and processing nodes.

Experiments have been performed to evaluate the performance for cases with 3 to 9 processors and
2 to 6 subsets per processor. Two different rule bases have been used. These two sets, 31508 and
64010, were generated using the following specifications:

31508: 3000 rules, 1500 different criteria, at most 8 criteria in each rule.

64010: 6000 rules, 4000 different criteria, at most 10 criteria in each rule.

The performance of each arrangement is compared with that of a single processor. The Speed-up,
defined as

Speed-up = time to execute on one processor / time to execute on p processors

is plotted in Figure 6.

The Speed-up factor for the case with three processing nodes is around 1 to 1.6. With the number
of nodes doubled, the Speed-up increases to about 1.4 to 2.3. The improvement in processing speed
is modest because of the heavy overhead of the necessary inter-processor communication. Unless the
rule set is so large that the communication time is small compared with the time for rule
checking, the speed improvement will not be very significant and the main benefit will be
fault tolerance.

We have also evaluated the performance assuming that one processor fails. The processing time is
plotted in Figure 7. The execution time using a single processor is 0.70528 seconds for the set 31508 and
0.83124 seconds for the set 64010.
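
To make the Speed-up definition concrete, the short sketch below converts a Speed-up factor into the implied multi-processor execution time and a per-processor efficiency. The single-processor times are the ones quoted above; the factor 1.6 is the upper end of the range reported for three nodes and is used here purely as an illustration, not as a measured value for either data set. The efficiency measure (Speed-up divided by p) is a standard quantity that the report itself does not use.

```python
def speed_up(t_one, t_p):
    """Speed-up = time on one processor / time on p processors."""
    return t_one / t_p


def parallel_time(t_one, factor):
    """Execution time on p processors implied by a given Speed-up factor."""
    return t_one / factor


def efficiency(factor, p):
    """Fraction of ideal (linear) Speed-up achieved on p processors."""
    return factor / p


T_ONE = {"31508": 0.70528, "64010": 0.83124}   # seconds, single processor

for name, t1 in T_ONE.items():
    t3 = parallel_time(t1, 1.6)
    print(name, round(t3, 4), round(efficiency(speed_up(t1, t3), 3), 3))
```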

[Figure 6 panels: (a) speedup versus the number of processors for the data set 31508; (b) speedup
versus the number of processors for the data set 64010.]


Figure 6. Speedup for different arrangements.

[Figure 7 panels: (a) execution time for the data set 31508; (b) execution time for the data set
64010.]


Figure 7. Execution time for different arrangements.

4. PARALLEL PROCESSING FOR NEUROCOMPUTING

Effort has also been devoted to the parallel processing of neural nets, which have become
important components in decision aids/intelligent systems. An idea similar to the one introduced
in Section 2 for rule checking can be applied to parallelizing neurocomputing. Gaussian function
based neural networks are considered. The basis function units are grouped into small subsets. The
performance has been evaluated for different sizes of neural networks and different numbers of
processors.

Basis  Function  Network 

Figure 8 shows a basis function network (BFN). Each row is a basis function unit. The notation b
denotes the global bias, \psi is the basis function and w_k is the weight. The input vector entering
the kth neuron is translated by t_k, rotated by R_k, and then scaled by D_k. D_k is a diagonal
matrix, of which each diagonal element is a scaling factor. The transformed vector is used as the
input to the basis function \psi.


Figure 8. RBFN and its learning parameters.

The network output is computed as

g_\theta(x_p) = b + \sum_{k=1}^{n} w_k \, \psi\big( D_k R_k (x_p - t_k) \big)    (1)

Learning tries to minimize the following cost function:

c(\theta, x_p, y_p) = \frac{1}{2} \left[ g_\theta(x_p) - y_p \right]^2    (2)

The  learning  rules  for  development  of  a  desired  functional  approximation  or  mapping  can  be  found  from 
[5]. 

Parallel Processing for BFN Using the PARAGON Architecture

In  the  parallel  processing,  processing  node  0  is  assumed  to  get  the  input  vector,  send  it  to 
processing  nodes  1  through  p,  receive  the  computational  results  from  them,  and  compute  the  final  output 
vector.  Each  of  the  processing  nodes  1  through  p  is  assigned  to  handle  the  computation  for  a  subset  of 
basis  function  units.  The  structure  is  shown  in  Figure  9,  in  which  each  row  in  Figure  8  is  denoted  by  a 
circle. 
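
The division of work between node 0 and nodes 1 through p can be sketched serially in Python. This is a minimal sketch under several assumptions: the basis function is taken to be the Gaussian exp(-||z||^2), the translation enters as (x - t_k), each unit carries one weight per output, the bias is a vector, and the units are dealt out to workers in round-robin fashion. None of these details are fixed by the text, and the message passing is replaced by ordinary function calls.

```python
import math


def unit_output(x, unit):
    """w_k * psi(D_k R_k (x - t_k)) for one basis function unit (one value per output)."""
    shifted = [xi - ti for xi, ti in zip(x, unit["t"])]
    rotated = [sum(r * s for r, s in zip(row, shifted)) for row in unit["R"]]
    scaled = [d * v for d, v in zip(unit["D"], rotated)]   # D_k is diagonal
    psi = math.exp(-sum(v * v for v in scaled))            # assumed Gaussian basis
    return [w * psi for w in unit["w"]]


def partial_sum(x, units, n_out):
    """Work done by one of the processing nodes 1..p on its subset of units."""
    total = [0.0] * n_out
    for unit in units:
        total = [a + b for a, b in zip(total, unit_output(x, unit))]
    return total


def network_output(x, units, bias, p):
    """Node 0: hand the input to p workers, collect partial sums, add the bias."""
    chunks = [units[i::p] for i in range(p)]               # assumed round-robin grouping
    partials = [partial_sum(x, chunk, len(bias)) for chunk in chunks]
    return [b + sum(part[i] for part in partials) for i, b in enumerate(bias)]


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
UNITS = [
    {"t": [0.1 * k, -0.2 * k], "R": IDENTITY, "D": [1.0, 0.5], "w": [1.0 + k, 2.0 - k]}
    for k in range(5)
]
x = [0.3, -0.1]
serial = network_output(x, UNITS, [0.5, -0.5], p=1)
split = network_output(x, UNITS, [0.5, -0.5], p=3)
print(serial, split)
```

Because the output is a plain sum over units, the partial sums can be computed independently and the result agrees with the serial computation up to floating-point rounding.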

Experiments have been performed for Gaussian basis function networks with four inputs and four
outputs, and for those with eight inputs and eight outputs. The speed-up factor has been evaluated.
The results are shown in Figure 10. Curves in the two figures are for different numbers of basis
units. The results show that the efficiency increases as the network size becomes large. For big
networks, the overhead in inter-processor communication is relatively insignificant.

Figure 9. The structure for parallel processing for BFN.

[Figure 10 panels: (a) speedup vs. the number of processors (4 inputs and 4 outputs); (b) speedup
vs. the number of processors, 1 to 16 (8 inputs and 8 outputs). Each curve corresponds to a
different number of BF units.]

Figure 10. Speedup for different arrangements.

5. AUTOMATIC TASK DECOMPOSITION FOR SCALEABILITY


Scaleability is a desired feature. It will be necessary to reconfigure the system if there are not
enough processing resources or if part of the decision aid has been modified. Automatic task
decomposition is needed to reconfigure the system efficiently. Each stage in Figure 1 is assumed
to include a large number of small procedures. These procedures should be grouped into subsets and
assigned to available nodes. To perform the task decomposition, information regarding the inputs,
outputs and execution time of each procedure (as shown in Figure 11) is required.

Stage 1                                               Stage 2                                               Stage 3 ...

P11: (input list), (output list), (execution time)    P21: (input list), (output list), (execution time)
P12: (input list), (output list), (execution time)    P22: (input list), (output list), (execution time)


(Pij's are procedures)


Figure  11.  Information  for  procedures. 

When the decision aid is modified (a procedure is added, deleted or changed), only the related
parts need to be updated. The information given in Figure 11 implicitly specifies a tree structure
of task dependence.

Task Scheduling


With the information in Figure 11 available, a procedure can be devised to assign tasks to
available processing nodes. The goal is to minimize the processing time with the available
resources. A possible procedure for task scheduling [6,Ch2] uses the following rule:

The  number  of  successors  of  a  subtask  is  used  as  its  priority.  Whenever  a  processing  node 
becomes  available,  the  unexecuted  ready  subtask  with  the  highest  priority  will  be  assigned. 

Figure  12  shows  an  example  of  task  trees  and  the  results  of  task  assignments.  Two  numbers  in 
each  node  in  the  task  trees  denote  the  subtask  number  and  the  needed  processing  time.  Subtasks  in  two 
trees  are  to  be  assigned  to  two  processors.  Based  on  the  rule  given  above,  subtask  1  with  two  successors  is 
assigned  to  processor  1,  and  subtask  5  is  assigned  to  processor  2  since  it  is  the  only  ready  task.  Subtask  1 
is  finished  earlier.  Subtask  3  with  one  successor  will  then  be  assigned  to  processor  1  (after  subtask  1). 
The  assignment  will  continue  until  all  subtasks  are  assigned. 
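
The priority rule can be sketched as a small list scheduler. The task trees, the processing times and the choice to count only direct successors are assumptions made for illustration; they are chosen so that the first few assignments follow the narrative above (subtask 1 on processor 1, subtask 5 on processor 2, then subtask 3 on processor 1), but they are not the data of Figure 12.

```python
import heapq


def list_schedule(duration, successors, num_procs):
    """Greedy list scheduling; priority = number of direct successors (assumed)."""
    preds_left = {t: 0 for t in duration}
    for succ in successors.values():
        for s in succ:
            preds_left[s] += 1
    priority = {t: len(successors.get(t, [])) for t in duration}

    ready = {t for t in duration if preds_left[t] == 0}
    idle = list(range(1, num_procs + 1))
    running = []       # heap of (finish time, task, processor)
    schedule = {}      # task -> (processor, start, finish)
    now = 0
    while ready or running:
        # Hand ready subtasks to idle processors, highest priority first.
        for task in sorted(ready, key=lambda t: (-priority[t], t)):
            if not idle:
                break
            proc = idle.pop(0)
            ready.discard(task)
            finish = now + duration[task]
            schedule[task] = (proc, now, finish)
            heapq.heappush(running, (finish, task, proc))
        # Advance the clock to the next completion time and release successors.
        now = running[0][0]
        while running and running[0][0] == now:
            _, done, proc = heapq.heappop(running)
            idle.append(proc)
            for s in successors.get(done, []):
                preds_left[s] -= 1
                if preds_left[s] == 0:
                    ready.add(s)
        idle.sort()
    return schedule


# Hypothetical task trees: 1 -> {2, 3}, 3 -> {4} and 5 -> {6}.
DURATION = {1: 5, 2: 10, 3: 10, 4: 12, 5: 15, 6: 10}
SUCCESSORS = {1: [2, 3], 3: [4], 5: [6]}

schedule = list_schedule(DURATION, SUCCESSORS, num_procs=2)
for task in sorted(schedule):
    print(task, schedule[task])
```

In this toy run subtask 6 ends up on a different processor from its predecessor, subtask 5, which is exactly the situation in which inter-processor communication becomes necessary.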

[Figure 12 layout: two task trees (Tree 1 and Tree 2) drawn by level, and a Gantt chart of the
resulting assignment to processors P1 and P2.]


Figure  12.  An  example  of  task  scheduling. 

Clustering Technique

The necessary inter-processor communication time is not considered in the above scheme. Subtask 6
needs the result from subtask 5 but is assigned to a different processor, so communication time is
then needed. The problem can be addressed by applying a clustering algorithm that groups subtasks
and has them executed on the same processing node. The purpose is to reduce the communication time.
The flow chart in Figure 13 shows the procedure [6,Ch6]. The procedure first initializes all links
in the task tree to be unmarked and makes each subtask a cluster. It then finds the longest path
composed of unmarked links. The subtask nodes in the path are grouped into a cluster, the links
are marked, and the link costs are zeroed. The procedure is repeated until all links are marked.


Figure  13.  The  flow  chart  for  the  clustering  algorithm 
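
A compact sketch of the clustering loop is given below. It assumes that path length is the sum of subtask times and link (communication) costs, and that once a path is clustered every link touching one of its subtasks is marked, so that later paths cannot pass through an existing cluster. The text does not spell out that second point, so it should be read as an assumption, and the task graph and costs are hypothetical.

```python
def longest_unmarked_path(nodes, time, links, marked):
    """Longest path (subtask times plus link costs) that uses only unmarked links."""
    succ = {n: [] for n in nodes}
    for (u, v), cost in links.items():
        if (u, v) not in marked:
            succ[u].append((v, cost))
    best = {}   # node -> (length of best path starting here, next node on it)

    def walk(n):
        if n not in best:
            tail = max((walk(v)[0] + c, v) for v, c in succ[n]) if succ[n] else (0, None)
            best[n] = (time[n] + tail[0], tail[1])
        return best[n]

    starts = [n for n in nodes if succ[n]]
    start = max(starts, key=lambda n: walk(n)[0])
    path = [start]
    while best[path[-1]][1] is not None:
        path.append(best[path[-1]][1])
    return path


def linear_clustering(nodes, time, links):
    """Repeatedly cluster the longest unmarked path until every link is marked."""
    marked, clusters, clustered = set(), [], set()
    while len(marked) < len(links):
        path = longest_unmarked_path(nodes, time, links, marked)
        clusters.append(path)            # links inside a cluster now cost nothing
        clustered.update(path)
        for link in links:
            if link[0] in clustered or link[1] in clustered:
                marked.add(link)
    clusters.extend([n] for n in nodes if n not in clustered)
    return clusters


NODES = [1, 2, 3, 4, 5, 6]
TIME = {1: 5, 2: 10, 3: 10, 4: 12, 5: 15, 6: 10}
LINKS = {(1, 2): 3, (1, 3): 3, (3, 4): 3, (5, 6): 3}   # link -> communication cost

clusters = linear_clustering(NODES, TIME, LINKS)
print(clusters)
```

A scheduler can then treat each returned cluster as a unit and place all of its subtasks on the same processing node.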

Figure 14 shows one example of applying the clustering algorithm. The time for inter-processor
communication will be reduced if a cluster of subtasks is assigned to the same processor. A task
scheduling algorithm should assign tasks in the same cluster to the same processing node.


Figure 14. An example after applying the clustering algorithm (after [6]).

6. CONCLUSION


In this project, we investigated parallel processing techniques for possible use in decision aids.
In the first part, we evaluated the performance of parallel processing for two-state rule-based
decision aids. Each subset of rules was assigned to multiple processors in order to provide fault
tolerance. As expected, the improvement in speed was not very significant because of the
inter-processor communication time. Fault tolerance is one merit of such an arrangement. In the
second part, we used the PARAGON computer for parallel computation of basis function networks. The
speed-up is excellent if the network size is large, because the time for inter-processor
communication is then small compared with that for basis function computation. The results show
that the mesh processor architecture is well suited to the implementation of large basis function
networks. In the third part, we discussed techniques for scheduling tasks on multiple processing
nodes. The clustering algorithm is a technique that can help reduce inter-processor communication.

EXPERIMENTAL AND COMPUTATIONAL INVESTIGATIONS OF BROMINE AND
IODINE  CHEMISTRY  IN  FLAME  SUPPRESSION 


Paul  Marshall 
Associate  Professor 
Department  of  Chemistry 
University  of  North  Texas 
PO  Box  5068,  Denton,  Texas  76203-0068 


Abstract 

Rate constants for the reaction of H atoms with the alkyl iodides iodomethane (1), iodoethane
(2), 2-iodopropane (3) and 2-iodo-2-methylpropane (4) have been measured using the
flash-photolysis resonance fluorescence technique. The results are k1 = (6.8 ± 0.3) x 10^-11
exp((-5.4 ± 0.1) kJ mol^-1/RT) (T = 297-757 K), k2 = (1.1 ± 0.2) x 10^-10 exp((-5.9 ± 0.8)
kJ mol^-1/RT) (T = 295-624 K), k3 = 1.4 x 10^-11 (T = 295 K) and k4 = 2.0 x 10^-11 (T = 294 K)
cm^3 molecule^-1 s^-1. Estimated accuracies are discussed in the text. The transition state for
H + CH3I -> I + CH4 was characterized at the Gaussian-2 level of ab initio theory, and substitution
was shown to be a slow process. H-atom abstraction is also argued to be slow, and the dominant
pathway for H + iodomethane reactions is suggested to be I-atom abstraction leading to HI formation.

Introduction 

There is growing interest in the combustion chemistry of iodine compounds, arising from the
search for substitutes for the halon fire extinguishing agents CF3Br and CF2ClBr. Halon production
is banned under the Montreal Protocol on Substances that Deplete the Ozone Layer. CF3I is a
potential candidate for service as a new fire suppressant, but there is a lack of kinetic
information for iodine reactions, especially at elevated temperatures. Modeling of the radical
inhibition chemistry of CF3I suggests that significant destruction of H-atom chain carriers occurs
via CH3I formed in flames from CH3 + I recombination, followed by reaction of CH3I with H atoms.
The reaction

CH3I + H -> products    (1)

has been argued to be the dominant process for CH3I consumption in a stoichiometric CH4/air flame.
To date there have been several studies of reaction 1 at room temperature, but the temperature
dependence of the rate constant k1 has not been measured. Based on the thermochemistry of CH2I
and other species there are three exothermic product channels:

CH3I + H -> CH3 + HI      \Delta H_298 = -61 kJ mol^-1     (1a)

         -> CH2I + H2     \Delta H_298 = -3 kJ mol^-1      (1b)

         -> CH4 + I       \Delta H_298 = -201 kJ mol^-1    (1c)

The transition state for 1a has been characterized computationally by Schiesser et al., while
Marshall et al. have analyzed the transition states for channels 1a and 1b using the Gaussian-2
methodology of Pople and coworkers, as extended to iodine compounds by Glukhovtsev et al., and
derived high-temperature ab initio rate constants and branching ratios for H vs I abstraction. In
the present work the ab initio analysis is extended to the displacement channel 1c.

There has been a single study of

C2H5I + H -> products    (2)

which yielded an experimental room temperature value of k2 about two orders of magnitude smaller
than literature values for k1. The present work describes the first measurements of the temperature
dependences of k1 and k2, and resolves the discrepancy. Structural factors that contribute to the
reactivity of iodoalkanes are also considered, and the reactivities of primary, secondary and
tertiary C-I bonds are compared through room temperature measurements of

CH3CHICH3 + H -> products    (3)

for which there is one prior determination, and

(CH3)3CI + H -> products    (4)

which appears not to have been studied previously. Likely products are discussed, and the results
are compared with ab initio information about channels 1a-1c.

Experimental  method 

The experimental apparatus and modifications for H-atom kinetics have been described
elsewhere. Briefly, atomic H was generated by pulsed flash lamp photolysis of NH3, through
MgF2 optics, in the presence of a large excess of iodoalkane. All experiments were carried out in
Ar bath gas at a total pressure P. The concentration of H was monitored using time-resolved
resonance fluorescence at a wavelength of 121.6 nm. Fluorescence was detected with a solar-blind
photomultiplier tube employed with pulse counting and signal averaging. Under the
pseudo-first-order conditions and fixed [NH3],

d[H]/dt = -(k_X[X] + k_diff)[H] = -k_ps1[H]    (5)

where X is an iodoalkane and k_diff accounts for loss of H atoms out of the reaction zone other
than by reaction with X (mainly by diffusion to the reactor walls). k_ps1 was obtained by fitting
the observed fluorescence intensity I_f versus time profiles to an exponential decay (an example is
shown as the inset on Fig. 1), and the second-order H + X rate constant k_X was found from linear
plots of k_ps1 versus typically five values of [X], from 0 to [X]max (see Fig. 1). The temperature
T in the reaction zone was monitored with a thermocouple, corrected for radiation errors, before
and after each set of k_X measurements. The average residence time of gas mixtures in the heated
reactor before photolysis, \tau_res, was varied to check for possible pyrolysis of the iodoalkanes,
while the energy discharged through the flash lamp, F, was varied to alter the initial radical
concentrations.
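
The second step of this analysis, extracting the second-order rate constant as the slope of the pseudo-first-order decay constant against iodoalkane concentration, can be sketched as follows. The data are synthetic: the five concentrations, the diffusional loss rate of 50 s^-1 and the rate constant of 7.6 x 10^-12 cm^3 molecule^-1 s^-1 are illustrative values, not measurements from the tables. An unweighted least-squares line is used for simplicity, which may differ from the fitting procedure actually used in the work.

```python
def fit_line(xs, ys):
    """Unweighted least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic example: decay constant = diffusional loss + k_X * [X]
K_X_TRUE = 7.6e-12        # cm^3 molecule^-1 s^-1 (illustrative)
K_DIFF_TRUE = 50.0        # s^-1 (hypothetical)
conc = [0.0, 1.0e13, 2.0e13, 3.0e13, 4.0e13]       # [X], molecule cm^-3 (hypothetical)
decay = [K_DIFF_TRUE + K_X_TRUE * x for x in conc]  # pseudo-first-order constants, s^-1

k_x, k_diff = fit_line(conc, decay)
print(k_x, k_diff)
```

The slope recovers the bimolecular rate constant and the intercept recovers the loss rate in the absence of iodoalkane, which is why a point at zero concentration is useful in the plot.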

The Ar (Air Products, 99.997%) was used directly from the cylinder and NH3 (MG Industries,
99.99%) was purified by freeze-pump-thaw cycles from 77 K. The iodoalkanes (CH3I, Aldrich, 99%
and 99.5%; C2H5I, Lancaster, 99%; CH3CHICH3, Aldrich, 99%; (CH3)3CI, Aldrich, 95%) were
purified by distillation, from 273 K for the first two reagents and from room temperature for the
latter two, and condensed at 77 K, to remove any iodine contamination.

Results 

The experimental conditions and results for k1, k2, k3 and k4 are summarized in Tables 1-3.
The k1 results were independent of the two different lots of CH3I employed. The lack of dependence
of k_X on F shows that secondary chemistry involving photolysis or reaction products was
negligible, and the lack of dependence of k1 and k2 on \tau_res shows that pyrolysis of CH3I and
C2H5I was unimportant at the listed temperatures. Data for reactions 1 and 2 at higher temperatures
did show consistent variation with \tau_res, and for reaction 2 showed a decrease in the apparent
k2 above 630 K, and therefore were excluded from further analysis.

The Arrhenius plot for reaction 1 is shown in Fig. 2, and the weighted linear fit yields

k1 = (6.8 ± 0.3) x 10^-11 exp((-5.4 ± 0.1) kJ mol^-1/RT) cm^3 molecule^-1 s^-1    (6)

The quoted errors in the Arrhenius parameters are 1\sigma and are statistical only. Consideration
of the covariance leads to a 1\sigma precision for the fitted k1 of 1-2%, and allowance for
possible systematic errors leads to 95% confidence intervals of ±10%. The Arrhenius plot for
reaction 2 is shown in Fig. 2, and the weighted linear fit yields

k2 = (1.1 ± 0.2) x 10^-10 exp((-5.9 ± 0.8) kJ mol^-1/RT) cm^3 molecule^-1 s^-1    (7)

Consideration of the covariance leads to a 1\sigma precision for the fitted k2 of 5-13%, and
allowance for possible systematic errors leads to 95% confidence intervals ranging from ±28% at
the extremes of the experimental T range to ±14% at the center. Similar accuracies, about ±20%,
are expected for reactions 3 and 4, which were studied at room temperature only (see Table 1).
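
As a quick numerical check of the fitted expressions, the sketch below evaluates the two Arrhenius forms at chosen temperatures. It assumes the pre-exponential factors read as 6.8 x 10^-11 and 1.1 x 10^-10 cm^3 molecule^-1 s^-1 and the activation energies as 5.4 and 5.9 kJ mol^-1; the function and variable names are illustrative. With those readings, the value for reaction 1 at 297 K and at 757 K falls close to the averaged entries of Table 1 (0.76 and 2.89, in units of 10^-11 cm^3 molecule^-1 s^-1 as that column is read here).

```python
import math

R_GAS = 8.314462618e-3   # gas constant, kJ mol^-1 K^-1


def arrhenius(pre_exponential, activation_energy_kj, temperature_k):
    """k(T) = A exp(-Ea / RT), with Ea in kJ mol^-1 and T in K."""
    return pre_exponential * math.exp(-activation_energy_kj / (R_GAS * temperature_k))


def k1(temperature_k):
    return arrhenius(6.8e-11, 5.4, temperature_k)    # cm^3 molecule^-1 s^-1


def k2(temperature_k):
    return arrhenius(1.1e-10, 5.9, temperature_k)    # cm^3 molecule^-1 s^-1


for temp in (297.0, 757.0):
    print(temp, k1(temp))
print(298.0, k2(298.0))
```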

Discussion 

Our k1 values are compared with four previous measurements in Fig. 3, and it may be seen
that there is particularly good accord with the most recent literature value k1(298 K) =
(7.9 ± 0.8) x 10^-12 cm^3 molecule^-1 s^-1. There is a single previous determination of k2, based
on a rate measurement relative to

H + HI -> H2 + I    (7)

which Rebbert et al. reported as k2 = 7 x 10^-14 cm^3 molecule^-1 s^-1. This is a factor of 150
smaller than measured here. Based on Sullivan's measurements of the reverse of reaction 7, Rebbert
et al. used k7 = 1.7 x 10^-13 cm^3 molecule^-1 s^-1. However, Baulch et al. used Sullivan's data
and the equilibrium constant for reaction 7 to obtain k7 ≈ 5 x 10^-11 cm^3 molecule^-1 s^-1 for
T = 667-800 K, a calculation which we reproduce. The source of the discrepancy appears to be mainly
that an incorrect value of k7 was employed earlier, while the earlier rate constant ratio k2:k7 was
essentially correct. The same explanation accounts for much of the discrepancy between the value of
k3 given by Rebbert et al., 1.6 x 10^-13 cm^3 molecule^-1 s^-1, and our own direct measurement
(Table 3), which is 90 times higher. There appear to be no literature values for k4.

As noted in the introduction, there are three possible reaction pathways for the H plus
iodoalkane reactions: I-abstraction, H-abstraction and I-substitution. The most exothermic process
is displacement of the I atom. We have characterized the transition state for this process at the
HF/6-31G(d) and MP2=full/6-31G(d) levels of theory, and the geometry is shown in Fig. 4. Higher
level single-point energy calculations yielded the G2 energy (see Table 4), which approximates a
QCISD(T)/6-311+G(3df,2p) result. Computations were carried out with the Gaussian 94 program
suite. Relative to the G2 energy of H + CH3I, the G2 barrier to substitution at 0 K is predicted
to be 45 kJ mol^-1. This barrier means that I-atom displacement is kinetically unfavorable.
Furthermore, the transition state for this process is found to be tight. The unscaled
MP2=full/6-31G(d) frequencies (Table 4) and the geometry, together with entropy data for H and
CH3I, lead to an entropy of activation for reaction 1c of \Delta S^{\ddagger}_298 = -91 J K^-1
mol^-1 and an implied preexponential factor at 298 K of about 2 x 10^-13 cm^3 molecule^-1 s^-1,
more than two orders of magnitude below that observed. Substitution will therefore make a
negligible contribution to the total k1.

An approximate idea of the rate constant for channel 1b can be obtained by consideration of
the analogous reaction

H + CH4 -> H2 + CH3    (8)

for which k8 is approximately 7.4 x 10^-19 cm^3 molecule^-1 s^-1 at 298 K. This is about 10^-7 of
k1 at room temperature, and therefore H abstraction plays a negligible role in reaction 1. This is
a reasonable comparison, bearing in mind the similar C-H bond strengths in CH4 and CH3I and,
further, is in accord with an earlier G2 ab initio analysis. In summary, k1 ≈ k1a. A similar
assessment of H-abstraction can be made for the heavier iodoalkanes. In these molecules all of the
C-H bonds are primary, and the rate constant for H-abstraction from iodoalkanes was estimated as
n/6 times the rate constant for H + C2H6 (4.5 x 10^-17 cm^3 molecule^-1 s^-1 at 298 K), where n is
the number of C-H bonds. The results are shown in Table 5, and in all cases the H-abstraction
channel is minor.

k_X at 298 K and the C-I bond dissociation enthalpy DH_298 are compared in Table 5. DH_298
values were derived from literature enthalpies of formation for alkyl radicals and iodoalkanes and
I atoms, and have uncertainties of about 2-3 kJ mol^-1. There is a monotonic trend for k_X along
the series CH3I and primary to tertiary. There is also a rough correlation with DH_298, where
within the thermochemical uncertainty the species with the weakest C-I bonds are the most reactive.

Finally, we note that there is reasonable accord between the ab initio k1a and that measured
here: at 298 K the ab initio k1a expression is too high, relative to eq. 6, by a factor of 1.15,
which increases to a factor of 2.9 at 760 K. As seen in Fig. 3, the discrepancies at higher
temperatures arise from the higher curvature predicted for the Arrhenius plot of k1a. A possible
explanation is that variational effects are important, and therefore that the conventional
transition state theory analysis overestimates k1a. Such effects are generally more significant for
reactions with smaller energy barriers, as is the case here.

Conclusions 

The  temperature  dependences  of  the  rate  constants  for  reactions  of  H  atoms  with  CH3I  and 
C2H5I  have  been  measured  for  the  first  time,  and  room  temperature  rate  constants  for  H  + 
CH3CHICH3  and  H  +  (CH3)3CI  have  also  been  characterized.  The  results  are  consistent  with  I- 
abstraction  as  the  main  reaction  pathway,  and  any  contributions  from  I-substitution  and  H-abstraction 
are  small. 
TABLE 1: Rate Constant Measurements of the Reaction of H + CH3I.

T, K   P, mbar   (illegible)   F, J   (illegible),      (illegible),      k1 ± σ(k1),
                                      10^? molecule     10^? molecule     10^-11 cm^3
                                      cm^-3             cm^-3             molecule^-1 s^-1
297    101.3     2.1           4.05   1.87              5.39              0.76 ± 0.02
297    69.0      1.4           6.05   1.47              3.67              0.77 ± 0.02
297    69.0      1.4           1.80   1.47              3.67              0.77 ± 0.02
297    53.0      1.1           4.05   0.82              3.58              0.73 ± 0.05
297                                                                       0.76 ± 0.01*
362    130.5     22            4.05   1.19              4.56              1.29 ± 0.03
362    86.7      1.6           6.05   1.08              3.99              1.22 ± 0.06
362    86.7      1.6           1.80   1.08              3.99              1.21 ± 0.07
362    70.1      0.8           4.05   0.72              2.73              1.00 ± 0.03
362                                                                       1.16 ± 0.08*
427    83.6      1.2           4.05   0.89              1.43              1.42 ± 0.06
427    132.0     1.9           6.05   1.40              3.32              1.60 ± 0.03
427    132.0     1.9           1.80   1.40              3.32              1.59 ± 0.04
427    68.5      0.7           4.05   0.62              1.54              1.38 ± 0.02
427                                                                       1.45 ± 0.08*
512    70.7      1.2           4.05   1.28              2.42              2.06 ± 0.11
512    110.6     1.9           4.05   1.73              2.57              1.95 ± 0.05
512    52.1      0.6           4.05   0.82              1.59              1.85 ± 0.11
512    50.5      0.9           4.05   0.99              2.43              1.97 ± 0.08
512                                                                       1.96 ± 0.03*
621    87.2      1.0           4.05   1.13              1.50              2.13 ± 0.09
621    160.8     1.9           4.05   1.41              1.70              2.34 ± 0.05
621    59.4      0.6           4.05   0.67              1.07              2.21 ± 0.10
621    108.6     1.8           4.05   1.13              2.05              2.27 ± 0.09
621                                                                       2.28 ± 0.05*
757    111.1     1.3           4.05   1.26              1.66              3.00 ± 0.07
757    60.0      0.5           4.05   0.71              0.88              2.61 ± 0.15
757    164.3     1.3           4.05   1.47              1.66              2.86 ± 0.08
757    75.6      0.9           4.05   0.83              1.13              2.63 ± 0.20
757                                                                       2.89 ± 0.08*

*Average value.
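
The temperature dependence of the averages in Table 1 can be summarized with a simple Arrhenius form, k1 = A exp(-Ea/RT). The sketch below is illustrative only: it is an unweighted least-squares fit of ln k1 against 1/T over the six average values, and it is not claimed to be the procedure or the functional form behind eq. 6. It assumes the k1 column is in units of 10^-11 cm^3 molecule^-1 s^-1; all names are hypothetical.

```python
import math

R = 8.314462618e-3  # gas constant, kJ mol^-1 K^-1

# (T in K, average k1 in units of 1e-11 cm^3 molecule^-1 s^-1) from Table 1.
# The 1e-11 scale factor is an assumption about the column units.
AVERAGES = [(297, 0.76), (362, 1.16), (427, 1.45),
            (512, 1.96), (621, 2.28), (757, 2.89)]


def arrhenius_fit(points, scale=1e-11):
    """Unweighted linear least squares of ln k against 1/T.

    Returns (A, Ea) with A in the units of k and Ea in kJ mol^-1.
    """
    xs = [1.0 / t for t, _ in points]
    ys = [math.log(k * scale) for _, k in points]
    n = len(points)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx                      # equals -Ea/R
    intercept = y_mean - slope * x_mean    # equals ln A
    return math.exp(intercept), -slope * R


A, Ea = arrhenius_fit(AVERAGES)
k_298 = A * math.exp(-Ea / (R * 298.0))

print(f"A  = {A:.2e} cm^3 molecule^-1 s^-1")
print(f"Ea = {Ea:.2f} kJ mol^-1")
print(f"k1(298 K) = {k_298:.2e} cm^3 molecule^-1 s^-1")
```

A weighted fit using the quoted uncertainties, or a form with a temperature-dependent pre-exponential factor, would be natural refinements if curvature in the Arrhenius plot matters.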


TABLE 2: Rate Constant Measurements of the Reaction of H + C2H5I.

T, K   P, mbar   (illegible)   F, J   [NH3],            (illegible),      k2 ± σ(k2),
                                      10^? molecule     10^? molecule     10^-11 cm^3
                                      cm^-3             cm^-3             molecule^-1 s^-1
295    76.5      1.6           4.05   1.22              4.62              1.00 ± 0.03
295    55.5      0.9           6.05   0.94              4.18              0.92 ± 0.02
295    55.5      0.9           1.80   0.94              4.18              0.85 ± 0.03
295    109.5     1.7           4.05   0.66              6.62              1.06 ± 0.04
295                                                                       0.94 ± 0.04*
357    131.8     2.3           4.05   1.23              3.92              1.42 ± 0.04
357    88.4      1.5           6.05   1.20              5.08              1.59 ± 0.02
357    88.4      1.5           1.80   1.20              5.08              1.50 ± 0.03
357    67.4      0.8           4.05   0.65              3.48              1.36 ± 0.05
357                                                                       1.53 ± 0.05*
450    132.8     1.8           4.05   0.86              3.05              1.95 ± 0.05
450    91.3      1.2           6.05   1.49              4.15              2.29 ± 0.03
450    91.3      1.2           1.80   1.49              4.15              2.22 ± 0.03
450    68.0      0.7           4.05   0.83              2.89              2.21 ± 0.08
450                                                                       2.21 ± 0.06*
533    129.5     1.8           4.05   0.99              2.71              2.61 ± 0.03
533    85.0      1.0           6.05   0.55              1.99              2.65 ± 0.05
533    85.0      1.0           1.80   0.55              1.99              2.55 ± 0.07
533    69.4      0.6           4.05   0.33              1.79              2.36 ± 0.08
533                                                                       2.59 ± 0.04*
624    155.6     2.0           4.05   1.10              3.02              3.95 ± 0.07
624    96.9      0.9           4.05   0.49              1.79              3.75 ± 0.13
624    153.8     1.8           4.05   0.70              2.75              3.88 ± 0.16
624    76.6      0.6           4.05   0.35              1.30              3.58 ± 0.09
624                                                                       3.80 ± 0.09*

*Average value.
TABLE 3: Rate Constant Measurements of the Reactions of H + CH3CHICH3 and (CH3)3CI.
Table 4: Ab initio results for the transition state for H + CH3I → CH4 + I.

HF/6-31G(d) frequencies (a)          1199i, 348, 410 (2), 1082 (2), 1187, 1541 (2), 3313, 3495 (2)
MP2=full/6-31G(d) frequencies (a)    1674i, 463, 470 (2), 1160 (2), 1255, 1464 (2), 3193, 3375 (2)
MP2/6-311G(d,p) (b)                  -6957.17377
MP4/6-311G(d,p) (b)                  -6957.21271
QCISD(T)/6-311G(d,p) (b)             -6957.21898
MP2/6-311+G(d,p) (b)                 -6957.17498
MP4/6-311+G(d,p) (b)                 -6957.21407
MP2/6-311G(2df,p) (b)                -6957.23020
MP4/6-311G(2df,p) (b)                -6957.27893
MP2/6-311+G(3df,2p) (b)              -6957.25902
G2[all-electron] (b)                 -6957.31294

(a) Unscaled, in cm^-1.

(b) Energy in au; 1 au ≈ 2625 kJ mol^-1.
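
The G2 entry in Table 4 can be cross-checked against the other rows using the standard G2 additivity scheme. The sketch below is a consistency check under stated assumptions, not a description of how the authors ran their calculations: it assumes the usual G2 definitions (higher-level correction of -4.81 mhartree per beta valence electron and -0.19 mhartree per alpha valence electron; zero-point energy from the HF/6-31G(d) frequencies scaled by 0.8929), and it assumes 15 valence electrons for the doublet transition state (8 alpha, 7 beta). All names are hypothetical.

```python
HARTREE_IN_CM1 = 219474.63  # 1 hartree expressed in cm^-1

# Energies from Table 4, in hartree.
ENERGIES = {
    "MP2/6-311G(d,p)": -6957.17377,
    "MP4/6-311G(d,p)": -6957.21271,
    "QCISD(T)/6-311G(d,p)": -6957.21898,
    "MP2/6-311+G(d,p)": -6957.17498,
    "MP4/6-311+G(d,p)": -6957.21407,
    "MP2/6-311G(2df,p)": -6957.23020,
    "MP4/6-311G(2df,p)": -6957.27893,
    "MP2/6-311+G(3df,2p)": -6957.25902,
}

# Real HF/6-31G(d) frequencies from Table 4 in cm^-1, with degenerate modes
# listed twice; the imaginary mode (1199i) does not contribute to the ZPE.
HF_FREQS = [348, 410, 410, 1082, 1082, 1187, 1541, 1541, 3313, 3495, 3495]


def g2_energy(e, freqs, n_alpha, n_beta, zpe_scale=0.8929):
    """Assemble a G2 energy from its additive components (hartree)."""
    base = e["QCISD(T)/6-311G(d,p)"]
    d_plus = e["MP4/6-311+G(d,p)"] - e["MP4/6-311G(d,p)"]
    d_2df = e["MP4/6-311G(2df,p)"] - e["MP4/6-311G(d,p)"]
    d_g2 = (e["MP2/6-311+G(3df,2p)"] - e["MP2/6-311G(2df,p)"]
            - e["MP2/6-311+G(d,p)"] + e["MP2/6-311G(d,p)"])
    hlc = (-4.81 * n_beta - 0.19 * n_alpha) * 1.0e-3
    zpe = 0.5 * zpe_scale * sum(freqs) / HARTREE_IN_CM1
    return base + d_plus + d_2df + d_g2 + hlc + zpe


# Assumed valence-electron count for the doublet transition state.
e_g2 = g2_energy(ENERGIES, HF_FREQS, n_alpha=8, n_beta=7)
print(f"Reassembled G2 energy: {e_g2:.5f} au")
```

With these assumptions the components sum to the tabulated G2 value to within the five decimal places quoted, which suggests the table rows have been read correctly.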
Table 5: Comparison of iodoalkane properties at room temperature.

Iodoalkane   k(298 K) for H abstraction (a),   measured kx(298 K),       DH298(C-I),
             cm^3 molecule^-1 s^-1             cm^3 molecule^-1 s^-1     kJ mol^-1
CH3I         5.6 x 10^-1?                      7.7 x 10^-12              237
C2H5I        3.8 x 10^-17                      1.0 x 10^-11              237
CH3CHICH3    5.3 x 10^-17                      1.4 x 10^-11              238
(CH3)3CI     6.8 x 10^-17                      2.0 x 10^-11              230

(a) Empirical estimate (see text).


Figure  captions 


Fig. 1 Plot of pseudo-first-order rate constant kps1 vs [CH3I] at P = 70 mbar and T = 512 K. The
inset shows the time-resolved fluorescence intensity If for the filled point.

Fig. 2 Arrhenius plot of rate constants k1 for H + CH3I (•) and k2 for H + C2H5I (o). Each point is
the average of four measurements.

Fig. 3 Ab initio geometries of the C3v transition state for H + CH3I → CH4 + I. MP2=full/6-31G(d)
data shown (HF/6-31G(d) in parentheses). Distances are in 10^-10 m and the ICH angles are
96.6° (95.8°).

Fig. 4 Comparison of measured k1 for H + CH3I (solid line, this work) with ab initio results (dashed
line, ref 3) and literature values (•, ref 4; ■, ref 5; ref 6; 0, ref 7).
Abstract


An experimental investigation of shear localization at the tip of a U-notch is reported. The initiation and
propagation of shear localization from the notch tip in two aging conditions of 300 maraging steel are recorded
using ultra-high-speed photography. The shear failure susceptibility of the materials and the transition
from shear failure to tensile failure are discussed. These two areas are identified as important because shear
localization  as  a  failure  mechanism  requires,  first,  that  the  material  be  susceptible  to  such  a  localization, 
second,  that  the  localization  be  dominant  over  other  modes  of  failure  and,  last — due  to  boundary  conditions 
in  specific  problems — that  the  shear  localization  propagate  from  one  point  to  another.  In  reference  to  the 
first  topic,  the  fundamental  issue  is  whether  shear  localization  susceptibility  can  be  measured  at  all.  In  this 
work  it  is  indicated  that  peak-aged  300  maraging  steel  is  qualitatively  more  susceptible  than  under-aged  to 
shear  localization.  The  final  failure  of  the  specimen  is  characterized  through  ultra-high-speed  observation 
and  post-mortem  examination.  Propagation  is  generated  by  impacting  side  notched  plates  while  observations 
are  made  using  high  speed  photography  at  framing  rates  of  480,000  fps.  Shear  failure  is  seen  to  propagate  at 
1000  m/s  in  peak-aged  material  and  200  m/s  in  under-aged  material.  The  peak-aged  material  fails  fully  by 
shear  while  the  shear  failure  in  under-aged  material  arrests  and  is  followed  by  tensile  failure.  Finite  element 
modeling  is  used  to  determine  the  nature  of  elastic  wave  propagation  in  the  specimen. 

Limitations  of  the  notch  geometry  used  in  the  first  part  of  the  study  lead  to  the  investigation  of  shear 
localization  at  the  tip  of  V-notches  with  interest  directed  toward  the  measurement  of  shear  susceptibility. 
A  V-notch  of  opening  angle  90°  was  chosen  to  minimize  compressive  stress  on  the  surface  of  the  shear 
localization  plane  while  allowing  a  singular  stress  field  at  the  notch  tip.  A  lower  compressive  stress  on  the 
shear  plane  reduces  friction  on  the  fracture  surfaces  thereby  allowing  the  localization  to  grow  independently 
of  the  friction  parameters.  Also,  the  90°  V-notch  allows  easy  application  of  dynamic  loads  to  the  notch  faces 
making  it  easier  to  induce  plasticity  at  the  notch  tip.  Various  geometries  of  dynamic  loading  are  examined 
for  eight  different  materials;  the  majority  of  the  materials  are  high-strength  armor  materials  for  which  recent 
investigations  have  reported  the  constitutive  law  parameters.  Results  indicate  the  onset  of  shear  localization 
is  much  more  difficult  in  most  materials  than  it  is  in  maraging  steels. 
Investigations of Shear Localization in Energetic Materials Systems


James  J.  Mason 

Preface 

Explosive  devices  for  military  applications  usually  involve  a  reactive  material  encased  in  a  metal.  Often 
that  metal  is  an  ultra-high  strength  steel  having  a  yield  strength  well  above  most  other  metals.  The  hardening 
mechanisms  in  these  steels  can  be  precipitation  strengthening  as  in  maraging  steels  or  solution  hardening  as 
in  the  tempered  martensite  steel.  In  either  case  the  hardening  mechanism  often  leads  to  reduced  ductility 
and reduced strain hardening. A consequence of these reductions in ductility and strain hardening is that
failure by one of two mechanisms is probable: tensile fracture enhanced by the reduced ductility, or adiabatic
shear localization and fracture enhanced by the reduced strain hardening. A competition between these two
failure mechanisms is expected, and experiments investigating such competition are useful. In what follows,
an investigation of the failure mechanisms of various ultra-high strength steels and other
metals is described. Focus is on the competition between tensile dominated failure and shear dominated
failure.  Shear  failure  is  seen  as  a  failure  mechanism  of  importance  because  it  can  lead  to  plug  formation  in 
the  metal  casing  of  an  explosive  device  followed  by  loading  of  the  interior  explosive  by  a  sharp-edged  punch. 
The  temperatures  in  the  metal  can  be  rather  high  due  to  shear  failure;  if  the  metal  in  an  explosive  device 
fails  by  this  mechanism  and  the  metal  subsequently  comes  into  contact  with  the  explosive,  the  explosive  may 
be  subsequently  ignited  by  the  hot  metal.  A  full  understanding  of  the  shear  failure  and  its  competition  with 
tensile  failure  makes  it  possible  to  prevent  such  ignition. 

Note  that  in  a  related  study,  Roessig  [1]  has  investigated  punch  loading  of  explosive  materials.  He 
concludes  that  early  fragmentation  of  the  explosive  prevents  the  formation  of  an  adiabatic  shear  band  in  the 
explosive  itself,  but  ignition  through  friction  between  explosive  fragments  or  between  explosive  and  metal  is 
a  probable  failure  mechanism.  Therefore,  studies  involving  friction  between  the  fragmented  explosive  and  a 
hot, fractured metal may be important in understanding the accidental ignition of military devices by lower
velocity impact, i.e., impact at velocities below the shock threshold of the materials.


Figure 1: The geometry of the Kalthoff test showing three types of failure: (a) tensile crack propagation due
to mostly shear loading, (b) shear crack propagation and (c) shear crack followed by a tensile crack.

1  PART  I:  Shear  Failure  at  the  Tip  of  a  U-Notch 

1.1  Introduction 

A  failure  mode  transition  observed,  post  mortem,  in  side  impacted,  edge-notched  plates  [2]  has  received 
increased  attention  of  late.  A  review  of  some  recent  work  in  this  area  can  be  found  in  a  special  issue  of  the 
International Journal of Plasticity to appear in 1997. Briefly, tests performed on 300 maraging steel in the
Kalthoff [3] geometry show a shear localization or shear crack forming at the notch tip at early times after
impact, followed by a tensile crack forming at a later, undetermined time. The failure mode transition
and specimen geometry are shown schematically in Figure 1(c). This phenomenon is particularly interesting
because  of  the  challenges  it  presents  in  numerical  modeling  of  such  an  event — both  the  material  behavior 
and  the  geometry  determine  the  crack  path  in  a  complex  interaction  that  may  be  used  to  test  how  robustly 
a  numerical  code  can  model  dynamic  fracture — and  because,  in  one  experiment,  it  invokes  a  competition 
between  two  distinctly  different  failure  mechanisms  thus  illuminating  the  important  features  of  each. 

It is also important to explore the Kalthoff test because it may serve as a useful test for measuring the
shear localization susceptibility of materials. Shear localization is widely understood to occur when thermal
softening due to plastic heating occurs at a greater rate than strain and strain-rate hardening [4, 5, 6].
Usually the deformation experiences an initial perturbation due to the existence of a material flaw that
leads to a shear localization when thermal softening is dominant. However, this is an initiation criterion
and therefore serves as a necessary but, perhaps, not sufficient condition for failure by shear localization.
It is known that the shear localization must propagate after initiation before failure occurs [7]; hence it is
important to investigate the material resistance to this propagation. In Kalthoff tests shear localization
propagates in a controlled fashion, as shown schematically in Figure 1(b). The notch radius, which can be
controlled using wire EDM machining, is the determining initiation flaw in the material, and the propagation
is determined by the material characteristics, specimen geometry and the impact velocity.

Several issues in the published work remain unresolved regarding the experimental results on
failure in the Kalthoff test. First, the nature of the material failure in the shear localization is not known.
Two mechanisms may be prevalent: adiabatic shear localization or ductile shear-void nucleation and growth.
Most likely, there is a competition between the two. Attempts by Mason et al. [2] to look carefully at the
notch tip during the shear failure were thwarted by the formation of an “aperture spot” on high-speed
photography images due to unavoidable vignetting of the collimated laser illumination. Second, the nature
and time scale of the failure mode transition have not been reported. It is not known experimentally whether
the shear failure arrests before the mode I crack forms or whether the transition from shear failure to tensile
failure is a smooth one. Even the time at which the later tensile crack forms is not known. Lastly,
the effects of changes in specimen geometry have not been fully explored. It is reasonable to assume that
the transition to a tensile crack occurs due to changes in the crack tip loading resulting from stress wave
reflections from the specimen sides, but little experimental verification of that assumption has been made.

In this work some results regarding these three issues are reported. A high speed photography system
has been used to take pictures of the specimen as it deforms shortly after impact; the transition from mostly
mode II to mostly mode I crack propagation is also recorded, and velocities of the cracks may be measured.
Lastly, some simple finite element models are developed to determine the nature of the loading for an
edge-notched, but uncracked, plate as the plate geometry is changed, with the aim of identifying
the best plate geometry for future tests of less shear-susceptible materials.

1.2  NUMERICAL  METHOD 

The finite element method was used to numerically examine the nature of the elastic wave propagation in
some test geometries as an extension of the work performed by Zimmerman [11], who analyzed the stress
intensity factor history in side-notched, 50 mm x 100 mm x 6 mm plates impacted from the side. Her results
are shown in Figure 2. It is hoped that the same geometry may be used on other materials that are less
susceptible to shear localization, but it is not known whether this particular geometry is ideal. Several
cases, shown in Figure 3, were examined to determine which changes in the specimen geometry might be
beneficial. Case 1, Figure 3(a), is that of an infinite plate. This case serves as a basis for comparison to
the other cases since no reflections are returned to the notch tip. Case 2, Figure 3(b), represents an infinitely
long double cantilever beam (DCB) type specimen. In the figure the beam is infinite to the right but of finite
width vertically. Case 3, Figure 3(c), represents an infinitely long bend specimen. The beam is infinite
vertically but of finite width in the horizontal direction. Lastly, a square beam, Figure 3(d), is analyzed to
demonstrate the effects of multiple reflections.

Figure 2: The dynamic mode-I and mode-II stress intensity factors, K_I and K_II, respectively, for the current
geometry with different projectiles.

The finite element package ABAQUS/Standard [8] was used to perform the calculations. A uniform
mesh of eight-noded, plane stress, square elements, 5 mm x 5 mm, was used to model the plate except in
the area of the notch tip, where eight-noded quarter point elements (QPEs) were used to represent the
notch tip as a crack tip. The material in the plate is linearly elastic with ρ = 1190 kg/m^3, E = 3.240 GPa
and ν = 0.35. In the three cases that required infinite dimensions, infinite elements were used as per the
ABAQUS/Standard User’s Manual. The projectile was modeled as a rigid body with an impact velocity
of 30 m/s. The contact surface algorithm included in the package was used to model the impact. In some
cases, plane strain analyses were performed to examine the effects of dilatational wave speed. This helped
determine whether shear waves or dilatational waves were dominating the notch tip behavior. Convergence
was established by repeating the analyses on a coarser mesh; both meshes gave the same
result.
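
The elastic wave speeds implied by these material constants follow from the standard isotropic linear-elastic relations. The sketch below is illustrative only; it uses the density, modulus and Poisson ratio given above and hypothetical function names, and it reproduces the plane stress dilatational wave speed of 1761 m/s quoted later in the text.

```python
import math

E_MODULUS = 3.240e9  # Young's modulus, Pa
RHO = 1190.0         # density, kg/m^3
NU = 0.35            # Poisson's ratio


def dilatational_speed_plane_stress(e, rho, nu):
    """Dilatational wave speed in a thin plate (plane stress), m/s."""
    return math.sqrt(e / (rho * (1.0 - nu ** 2)))


def dilatational_speed_plane_strain(e, rho, nu):
    """Dilatational wave speed under plane strain, m/s."""
    return math.sqrt(e * (1.0 - nu) / (rho * (1.0 + nu) * (1.0 - 2.0 * nu)))


def shear_speed(e, rho, nu):
    """Shear wave speed, m/s (the same in plane stress and plane strain)."""
    return math.sqrt(e / (2.0 * rho * (1.0 + nu)))


c_d_stress = dilatational_speed_plane_stress(E_MODULUS, RHO, NU)
c_d_strain = dilatational_speed_plane_strain(E_MODULUS, RHO, NU)
c_s = shear_speed(E_MODULUS, RHO, NU)

print(f"plane stress dilatational: {c_d_stress:.0f} m/s")
print(f"plane strain dilatational: {c_d_strain:.0f} m/s")
print(f"shear:                     {c_s:.0f} m/s")
```

Because the shear speed is unchanged between plane stress and plane strain while the dilatational speed is not, comparing the two analyses shifts only the dilatational arrivals, which is what makes the comparison useful for separating the two wave types.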

Quarter point elements were used to model the notch tip and measure the stress intensity factor there. The
stress intensity factors were calculated using the Williams expansion and the crack tip nodal displacements.
This method has been shown to be both efficient and simple in its application to transient dynamic analysis
[10]. The time step was chosen following the recommendations of Murti and Valliappan [10] to reduce
spurious oscillations in the results.

1.3  EXPERIMENTAL  METHOD 

Figure 3: The geometry of numerical cases analyzed: (a) the infinite specimen, (b) DCB specimen, (c) bend
specimen and (d) square specimen. The notch length is 25 mm, the horizontal dimension, when applicable, is 75 mm,
and the vertical dimension, when applicable, is also 75 mm. The projectile is rigid, semi-infinite and 25 mm in
width. It impacts perpendicular to the plate with its upper edge on the same line as the crack.

Plates of 300 maraging steel were machined into rectangles, 50 mm x 100 mm x 6 mm, and a notch, 25
mm long and 37 mm from the top, as shown schematically in Figure 1, was machined through the thickness
using wire EDM, resulting in a notch tip radius of 175 µm. Two aging conditions of the 300 maraging steel
are tested, one under-aged and one peak-aged, with aging times of 1/2 hour and 4 hours, respectively [11, 12].
The  plates  are  impacted  on  the  side,  as  shown  in  Figure  1,  with  an  air  gun  and  a  25  mm  diameter,  50  mm 
long  projectile  made  of  350  maraging  steel.  Impact  velocity  is  measured  using  an  infrared  detector-emitter 
pair  mounted  on  the  gun  barrel.  High  speed  photographs  are  taken  using  a  Cordin  model  330  camera  with 
a  Cordin  model  607  light  source.  A  strain  gage  is  placed  on  the  side  of  the  plate  near  the  impact  area  to 
record  the  time  of  impact.  A  high  speed  light  sensor  is  used  to  detect  the  flash  of  the  light  source.  Both 
the  strain  gage  signal  and  light  sensor  signal  are  recorded  on  a  digital  oscilloscope  to  give  the  timing,  with 
respect  to  impact,  of  the  recorded  photographs. 

Some  comments  should  be  made  on  the  geometry  of  the  specimens  chosen.  While  Kalthoff  used  plates  with 
dimensions  of  100  mm  x  200  mm  and  a  notch  machined  50  mm  through  the  width  (the  lesser  dimension), 
the specimens used here are smaller. As a starting point in this study the specimens were chosen to be
of the same geometry as Kalthoff's but scaled by one half, so that the outside dimensions were reduced to
approximately 50 mm x 100 mm and the notch length became 25 mm. This resulted in a considerable
cost savings without much change in behavior. Early tests by Zimmerman and Mason [11, 12] showed that
this change in size did not affect the salient features of the material behavior; failure mode transition was
still observed. It should be noted, however, that the analytical advantage of the Kalthoff geometry, in
general, is that the loading of the notch tip is essentially elastic. For all reasonable materials, if the impact
velocity is large enough to cause plastic deformation at the area of impact, the simultaneously generated
elastic wave reaches the notch tip before that plasticity. In fact, if the notch length is large compared to the
specimen dimensions, as it is in the specimen geometry used here and elsewhere [2, 3], plasticity induced at
impact never reaches the notch tip and the notch tip is loaded only by an elastic wave. This simplifies the
analysis to elastic loading with small scale plasticity. If the notch length is extremely small, however, plastic
deformation generated at impact may be able to reach the notch tip and more complicated analysis may be
needed.

1.4  RESULTS  AND  DISCUSSION 


1.4.1  Effects  of  specimen  geometry 


First, the solution for the infinite plate, a numerical extension of the solution of Lee and Freund [9] to
long times, is shown in Figure 4(a). It can be seen that K_I decreases linearly in time
and K_II increases logarithmically in time. Contact between the projectile and plate was not lost. The
normalization constants are the same as those given by Lee and Freund,
where E is Young’s modulus, v_0 is the impact velocity, c_d = 1761 m/s is the plane stress dilatational
wave speed, and l = 25 mm is the notch length.

Next, the finite square specimen was examined; see Figure 3(d) for the geometry and Figure 4(d) for the
calculated stress intensity factors. In this case it can be seen that the dynamic stress intensity factor K_II
decreases dramatically after a normalized time of 7. The behavior is quite similar to the results in Figure
2 for the plates in this study. The dashed line represents the solution for the infinite plate. In tests on
other specimens [2], it was assumed that this precipitous drop in the shear mode loading at the notch tip
was due to a reflected dilatational wave from the opposite side of the finite specimen. For this specimen the
dilatational wave first reaches the crack tip at t/t' = 5, reflects off the projectile/plate interface and returns
at t/t' = 7, the approximate time of the drop in K_II. Later behavior is due to reflections from the side
wall and multiple reflections within the specimen. The negative K_I behaves roughly like that of the infinite plate
until a normalized time of 10, where it begins to increase in value and eventually becomes positive. This is
a result of the induced vibration in the finite plate. The projectile lost contact with the plate after 170 µs,
or a normalized time of 11.7.
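
The two dilatational arrival times quoted above follow from simple path-length bookkeeping. The sketch below is illustrative only and rests on two assumptions: that time is normalized by t' = l/c_d (the notch length divided by the dilatational wave speed), and that the wave travels straight across the 75 mm wide square specimen of Figure 3(d). All names are hypothetical.

```python
NOTCH_LENGTH_MM = 25.0    # notch length l
PLATE_WIDTH_MM = 75.0     # horizontal dimension of the square specimen


def normalized_arrival(path_mm, notch_mm=NOTCH_LENGTH_MM):
    """Return t/t' for a dilatational wave, assuming t' = l / c_d.

    The wave speed cancels, so only the ratio of path length to
    notch length is needed.
    """
    return path_mm / notch_mm


# Impact face -> back surface -> notch tip.
first_path = PLATE_WIDTH_MM + (PLATE_WIDTH_MM - NOTCH_LENGTH_MM)
# Notch tip -> projectile/plate interface -> notch tip again.
second_path = first_path + 2.0 * NOTCH_LENGTH_MM

t_first = normalized_arrival(first_path)
t_second = normalized_arrival(second_path)

print(f"first arrival:  t/t' = {t_first:.1f}")
print(f"second arrival: t/t' = {t_second:.1f}")
```

Under these assumptions the path lengths are 125 mm and 175 mm, giving t/t' = 5 and t/t' = 7. The same bookkeeping indicates how much later the arrivals would occur in a wider plate.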

Next, the specimen was reconfigured as shown in Figure 3(b) to model a long DCB specimen. In
this configuration it was expected that the reflected dilatational wave from the opposite surface would be
eliminated, thereby eliminating the drop in K_II. The calculated stress intensity factors are shown in Figure
4(b). As can be seen, the drop in K_II seen in Figure 4(d) is eliminated, but the rate of increase in K_II is still
significantly curbed when compared to the infinite plate. The calculated time for a shear wave reflected
from the side to reach the crack tip is t/t' = 5.5. This time is shown as a solid vertical line in the figure, and
it corresponds with the divergence of the solution from that of the infinite plate. Dilatational waves reflected
from the sides arrive much earlier and have little effect. The K_I factor roughly repeats the behavior of the
infinite plate until a normalized time of approximately 20-25, after many multiple reflections have occurred.

Figure 4: The dynamic mode-I and mode-II stress intensity factors, K_I and K_II, respectively, for (a) a
notched infinite plate, (b) a notched, infinitely long DCB specimen, (c) a notched, infinitely long bend
specimen and (d) a small, square specimen.

Last, the specimen was reconfigured to be an infinite bend specimen as seen in Figure 3(c). In this case
it was hoped to eliminate effects of the shear wave and isolate the effects of the dilatational wave reflected
from the back surface. The results of the calculation are shown in Figure 4(c). It is easy to see that the
drop in K_II seen in Figure 4(d) is eliminated as it was in the DCB specimen, but, once again, the rate of
increase in K_II is reduced to near zero. A calculation of the time for a dilatational wave to reach the notch
tip from the back surface gives t/t' = 5; at that time little change in the notch tip mode-II intensity factor
occurs. If, however, the time is increased so that the wave may reflect off the projectile/plate interface and
return to the notch tip, then a significant change occurs. The time of arrival of such a wave is t/t' = 7 and
is shown as a vertical solid line in the figure. At this time K_II diverges from the result for the infinite
plate and becomes roughly constant. K_I follows the behavior of the infinite plate closely.
It is clear that a larger specimen is desirable because the longer distances give more time before reflected
elastic waves reach the notch tip and change the loading there. This can be seen in Figure 4 where the
infinite plate shows a monotonically increasing K_II but the others do not. The longer it takes for waves to
reflect back to the notch tip, the longer the duration of monotonically increasing K_II with a negative K_I.
The negative K_I is somewhat desirable because it increases the compressive hydrostatic stress directly ahead
of the notch tip, thereby suppressing brittle failure modes. The monotonically increasing K_II is desirable
because it leads to shear failure; however, the K_II loading can also lead to tensile failure at an angle of
approximately 70°, as reported by Kalthoff [3] and shown schematically in Figure 1(a). With the objectives
of generating an increasing K_II and a negative K_I at the notch tip in mind, the question arises of whether it is
better to have a bend type specimen, like Kalthoff’s, or a double cantilever beam (DCB) type specimen.
From this investigation, it appears that the bend specimen is not the best option because K_II
reaches a constant level. The DCB is a bit more promising because K_II is still increasing, but it is doing
so at a very low rate. In the former case, it is the reflection of the dilatational wave from the opposite surface
to the projectile/plate interface and back to the notch tip that determines the duration of the increasing
K_II loading. In the latter case, it is the reflection of shear waves from the side surface that determines the
duration of the increasing K_II loading. In the small specimen, Figure 3(d) and Figure 4(d), the combined
arrival of, first, the shear wave from the side, shown as the solid vertical line at t/t' = 5.5 in the figure, and,
next, the dilatational wave from the projectile/plate interface by way of the back surface, shown as the solid
vertical line at t/t' = 7, leads to the precipitous drop in K_II. (This is a simplified view of the process since we
have not considered the effects of waves reflected first off the side then off the back and so on.) Consequently,
it seems that neither the DCB nor the infinite bend specimen offers any significant improvement over the
other. The strategies of using momentum traps on the specimen to replicate the behavior of the infinite plate
or of using very short notches and higher impact velocities seem to be the next logical step in the evolution
of this test.

1.4.2  Observations  of  shear  localization  and  tensile  fracture 

The results of impacting two materials at nominally 40 m/s are reported. For the first case, under-aged 300
maraging material, the resulting photographs of the material failure are shown in Figure 5. At approximately
6 μs after impact, a normalized time of 1.3, a shear failure began propagating directly ahead of the notch.
Examination of the fracture surfaces after the test indicates that this is indeed a shear failure, in agreement
with Mason et al. [2] and Zimmerman and Mason [12]. This failure continues for approximately 14.5 μs
and arrests at a total normalized time of 4.4 when the upward slope in $K_{II}$ stops, as seen in Figure 2.
During that growth period the notch is closing due to lateral expansion (this is to be expected, as shown in
the analysis of Lee and Freund [9] and Zimmerman and Mason [12]), and the notch faces come into contact
at approximately the same time that the shear crack stops growing. It is not clear whether the shear crack
arrests because of the peak in $K_{II}$ or the contact of the faces, or both. Contact is held for about 27 μs,
after which the notch begins to open, at a total normalized time of 10.2 corresponding to a minimum
$K_I$ in Figure 2. Opening continues until a total normalized time of 16.4, 29 μs later, when $K_I$ becomes
positive and a crack appears growing at an angle upward from the shear crack. The numerical solution is
considered no longer valid after the crack opens and effectively changes the geometry of the specimen. This
new crack is tensile and growing under mixed-mode, shear and tensile, loading. The transition corresponds to
the change in $K_I$ from compressive to tensile. The subsequent arrest of this secondary crack is not recorded.
Clearly, this failure is a two-step process where each failure mode is distinct from the other. The arrest of
the shear crack, well before the appearance of the tensile crack, appears to be caused by either the closing of
the notch faces or the loss of a monotonically increasing $K_{II}$ loading, or both.

Figure 5: The dynamic fracture of under-aged 300 maraging steel. The vertical black stripe is light removed from
the photographs by the Cordin 330 camera for streak photography, if needed.

In the second case a peak-aged 300 maraging material was tested and significantly different results
were observed. The photographs of the material failure, taken at slightly higher magnification than in
the previous case, are shown in Figure 6. As with the under-aged material, at approximately 6 μs after
impact, a normalized time of 1.3, a shear failure began propagating directly ahead of the notch. However,
this failure propagates more rapidly than in the under-aged material; in the peak-aged material the shear
failure propagates at approximately 1000 m/s, a typical speed for dynamic fracture in steel, whereas in the
under-aged material the propagation rate was much slower, 200 m/s. Approximately 25 μs after initiation,
a normalized time of 6.6, the shear failure has traversed the 25 mm uncracked section of specimen and fully
failed the specimen. Failure is complete before a significant drop in $K_{II}$ can occur, although $K_{II}$ has ceased
to increase and contact of the notch faces can occur. Again, during the shear failure the notch is closing
due to lateral expansion, but in this case it has no effect upon propagation of the shear failure. Since the
entire notch is not visible, it is not clear when contact of the notch faces initially occurs outside the field
of view; presumably it occurs at the same time as in the under-aged test. But, as in the under-aged test,
evidence of the notch opening is seen after a normalized time of 10.2 when the maximum negative $K_I$ is
achieved. After shear failure, the opening mode serves to propel the two separated pieces away from each
other. It appears that the material failure and separation occur in the shear mode, since at later times the
crack opens in approximately 4 μs, which corresponds to an exceedingly high crack speed of 6000 m/s, well
beyond the theoretical maximum speed or any experimentally measured speed in steels [13]. The failure is
a one-step process dominated by shear failure.

Figure 6: The dynamic fracture of peak-aged 300 maraging steel.
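
The event times quoted in this section can be cross-checked against the normalized times quoted alongside them. The short Python sketch below is not part of the original analysis. It assumes that the quoted intervals chain together cumulatively (arrest 14.5 μs after initiation, opening 27 μs after arrest, and so on) and that the 25 mm uncracked section is the distance covered in the quoted durations. The names `implied_time_scale` and `average_speed` are illustrative only.

```python
# Event times from the text, in microseconds after impact, paired with the
# normalized times t/t' quoted alongside them. The cumulative sums are an
# assumption about how the quoted intervals chain together.
UNDER_AGED = [
    (6.0, 1.3),                        # shear failure initiates
    (6.0 + 14.5, 4.4),                 # shear crack arrests
    (6.0 + 14.5 + 27.0, 10.2),         # notch begins to open
    (6.0 + 14.5 + 27.0 + 29.0, 16.4),  # tensile crack appears
]
PEAK_AGED = [
    (6.0, 1.3),          # shear failure initiates
    (6.0 + 25.0, 6.6),   # shear failure has traversed the specimen
]


def implied_time_scale(events):
    """Return the normalizing time t' (in microseconds) implied by each pair."""
    return [t_us / t_norm for t_us, t_norm in events]


def average_speed(distance_mm, duration_us):
    """Average speed in m/s over distance_mm millimetres in duration_us microseconds."""
    return (distance_mm * 1e-3) / (duration_us * 1e-6)


if __name__ == "__main__":
    for name, events in (("under-aged", UNDER_AGED), ("peak-aged", PEAK_AGED)):
        scales = ", ".join(f"{s:.2f}" for s in implied_time_scale(events))
        print(f"{name}: implied t' (us) = {scales}")
    print(f"shear failure, 25 mm in 25 us: {average_speed(25.0, 25.0):.0f} m/s")
    print(f"apparent opening, 25 mm in 4 us: {average_speed(25.0, 4.0):.0f} m/s")
```

Under these assumptions every pair implies a normalizing time between roughly 4.6 and 4.7 μs, so the quoted times are mutually consistent. The 25 μs traverse gives 1000 m/s, and the 4 μs opening gives about 6250 m/s, in line with the approximate 6000 m/s quoted above.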
2 Part II: Shear Failure at the Tip of a V-Notch

2.1 INTRODUCTION

The U-notch, a notch with the two notch faces parallel and separated by the notch tip diameter, used by
Kalthoff introduces almost pure mode-II loading at the notch tip, which leads to a dominant shear mode.
However, its disadvantages are great: (i) the notch tip loading is determined by the magnitude of the stress
pulse which is, in turn, limited by the elastic limit of the material, so it may only be useful for
very-high-strength, exotic materials such as maraging steels; (ii) it requires large specimen dimensions or some
type of momentum trap to remove the effects of pulse reflections in finite specimens; (iii) contact of the notch
faces, as demonstrated in Part I, complicates the loading behavior; and (iv) loads cannot be easily applied
near the notch tip. For these reasons it was decided that specimens of other geometry be investigated.

Specimens with a V-notch, a notch with faces inclined at an angle to each other and intersecting at
a sharp corner of predetermined radius, have behavior that is similar to the U-notch: for many V-notch
angles there is a singular stress field. However, some of the disadvantages of the U-notch are alleviated. The
notch faces are further apart, preventing contact of those faces. Also, the V-notch angle may be chosen so
that impact can occur very near the notch tip and, consequently, plasticity at the tip may be more easily
introduced. For these reasons the V-notch geometry was investigated here for several materials and several
loading conditions.

Seweryn and Molski [14] have recently outlined the stress singularity at a V-notch for various static
loading and displacement boundary conditions and for different opening angles. A V-notch may be of
arbitrary opening angle, $2\alpha$, as shown in Figure 7. Just like a stationary U-notch, or a crack, the stress
singularity for the V-notch is expected to be the same for the static and dynamic cases with the stress
intensity factor varying in time. Therefore, the work of Seweryn and Molski [14] gives a valid indication of
the stress singularity at the tip of the V-notch under dynamic loading conditions. As Williams [15] assumed
for the Airy stress function, these authors assume a solution of the form $r^{\lambda} f(\theta)$ for the displacements and
find a general solution that can be separated into a symmetric (similar to Mode-I for a crack) and an
antisymmetric (similar to Mode-II for a crack) part. The application of stress free boundary conditions on the
V-notch faces leads to two sets of equations for $\lambda$. For the symmetric case these can be satisfied only if

$$\lambda \sin 2\alpha + \sin 2\lambda\alpha = 0.$$

The solutions for $\lambda$ may be real or complex, depending upon $\alpha$. When the exponent is complex,
$\lambda = \lambda_1 + i\lambda_2$, the dependence upon $r$ may be written

$$r^{\lambda} = r^{\lambda_1}\left[\cos(\lambda_2 \ln r) + i \sin(\lambda_2 \ln r)\right]$$

so that only the real part of $\lambda$ contributes to the singularity in stress. For the antisymmetric case a very
similar equation may be found,

$$\lambda \sin 2\alpha - \sin 2\lambda\alpha = 0.$$

Figure 7: The geometry and coordinates of the V-notch specimen with half-angle, $\alpha$.

The solutions to these equations are reproduced in Figure 8. For stress to be singular and displacements to
be bounded it must be that $0 < \lambda < 1$. The only meaningful values of $\alpha$ lie in the range $0 < \alpha \leq 180^\circ$.
As can be seen in the figure, a V-notch of opening angle 90°, corresponding to $\alpha = 135^\circ$, has an exponent
very near 0.5 for the symmetric case and very near 1.0 for the antisymmetric case. Both values are real;
higher order terms are complex. The U-notch or crack, with $\alpha = 180^\circ$, has an exponent of 0.5 for both the
symmetric and antisymmetric cases. Both values are real again; higher order terms, however, are real as
well.
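
The exponents quoted above can be checked numerically. The following Python sketch is an illustration, not the procedure used by Seweryn and Molski [14]: it scans for the smallest real root of the symmetric and antisymmetric characteristic equations and refines it by bisection. The function names, the scan interval and the step count are assumptions chosen for this example, and complex roots are not searched for.

```python
import math


def characteristic(lam, alpha_deg, symmetric=True):
    """Evaluate lam*sin(2a) + sin(2*lam*a) (symmetric) or
    lam*sin(2a) - sin(2*lam*a) (antisymmetric) for half-angle a in degrees."""
    alpha = math.radians(alpha_deg)
    sign = 1.0 if symmetric else -1.0
    return lam * math.sin(2.0 * alpha) + sign * math.sin(2.0 * lam * alpha)


def smallest_real_root(alpha_deg, symmetric=True, lo=0.05, hi=1.5, steps=2000):
    """Scan (lo, hi) for the first sign change and refine it by bisection.

    Returns None if no real root is bracketed in the scanned interval.
    """
    width = (hi - lo) / steps
    for i in range(steps):
        a = lo + i * width
        b = a + width
        fa = characteristic(a, alpha_deg, symmetric)
        fb = characteristic(b, alpha_deg, symmetric)
        if fa * fb > 0.0:
            continue  # no sign change in this sub-interval
        for _ in range(60):  # bisection refinement
            mid = 0.5 * (a + b)
            fm = characteristic(mid, alpha_deg, symmetric)
            if fa * fm <= 0.0:
                b = mid
            else:
                a, fa = mid, fm
        return 0.5 * (a + b)
    return None


if __name__ == "__main__":
    for alpha_deg in (90.0, 135.0, 180.0):
        sym = smallest_real_root(alpha_deg, symmetric=True)
        anti = smallest_real_root(alpha_deg, symmetric=False)
        print(f"alpha = {alpha_deg:5.1f} deg: symmetric {sym:.4f}, antisymmetric {anti:.4f}")
```

For $\alpha = 135^\circ$ the sketch returns roughly 0.54 for the symmetric case and 0.91 for the antisymmetric case, consistent with the values "very near 0.5" and "very near 1.0" read from Figure 8. For $\alpha = 180^\circ$ both roots are 0.5, and for $\alpha = 90^\circ$ the smallest root is 1, so no singularity is predicted for stress free faces.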

For the case when $\alpha = 90^\circ$ it is seen that both the symmetric and antisymmetric parts give values of
$\lambda \geq 1$, which indicates that no singularity is present for stress free boundary conditions along the V-notch
faces. However, if impact conditions are applicable a singularity may occur. For example, if a rigid punch
is statically pressed on a flat surface, $\alpha = 90^\circ$, the pressure distribution under the punch, $p(x)$, is singular at
the edges [16],

$$p(x) = \frac{P}{\pi\sqrt{a^2 - x^2}},$$

where $x = a$ at the edge of the punch and $P$ is the punch load. The stresses in the solid are singular
with respect to $r$ with $\lambda = 0.5$. Likewise, Fung [17] has shown that for arbitrary half-angle, $\alpha$, with a static
distributed  load  p(r)  =  Pr'^  on  the  V-notch  face  the  stress  is  of  order  r'^.  So  that  if  impact  occurs  on  the 
V-notch  surface,  a  singular  stress  field  of  order  might  be  expected. 

In all the cases discussed above only the static solutions are examined. Just like for a stationary crack, the
stress singularity for the V-notch is expected to be the same for the static and dynamic cases with the stress
intensity factor varying in time. Consequently, we can list four cases of dynamic loading in which elastostatic
analysis indicates there will be a singularity at the tip of a notch: the Kalthoff geometry, any V-notch with
stress-free faces and a half-angle, $\alpha$, greater than 90°, a flat surface under a punch, and any V-notch with a
singular distributed load on its faces. An example of each of these cases is illustrated schematically in Figure 9.
The present study set out to examine the dynamic fracture behavior of several metals under the conditions
shown in the figure.

[Figure 8: two panels, each plotting the real part of the exponent, Re $\lambda$.]

Figure 9: Examples of other geometries that lead to an elastic singularity in stress; (a) Kalthoff geometry,
(b) V-notch with stress free faces, (c) impact of a half-space and (d) V-notch with impact on one of its faces.

2.2  EXPERIMENTAL  METHOD 

The experimental method is simple in that only postmortem examination of the materials was performed.
Since these tests are relatively new, it was not certain what type of shear failure, if any at all, would be
observed. The purpose of the experiments was first to determine if shear localization could be induced using
dynamic loading of V-notches. Two V-notch geometries were tested along with the Kalthoff geometry and
a side impact geometry as shown in Figure 9; the latter cases were tested for comparison. The dimensions
of each specimen are shown in Figure 10.

Figure 10: Dimensions of specimens tested.


Plates of metal were machined into rectangles, 50 mm x 100 mm x 3 mm, and an appropriate notch
for each case was machined through the thickness using wire EDM, resulting in a notch tip radius of 175 μm.
The plates were impacted on the side, as shown in Figure 9, with an air gun and a 30 mm diameter, 150
mm long steel projectile. Impact velocity was measured using an infrared detector-emitter pair mounted on
the gun barrel.

Eight different metals were tested: titanium 6-4, 1018 steel, aluminum 6061-T6, maraging steel C350,
4340 steel, D6AC steel, HP-9-4-20 steel and 300M steel. Of these, the last five are ultra-high strength steels;
the first three were tested for comparison with their properties revealed in punch studies by Roessig and Mason
[18]. Some heat treatments of the ultra-high strength steels are taken from the work of Dilmore and Foster
[19], for which the Johnson-Cook constitutive law parameters were evaluated. Others were provided by the
manufacturer with the materials. Finally, the titanium, 1018 steel and 6061-T6 aluminum were tested in the
as-received state. All treatments performed at Notre Dame are shown in Figure 11.

2.3  RESULTS 

Because of delays in procuring the materials and machining the specimens, only an initial set of tests could
be performed. All materials were tested, but in only three of the geometries in Figure 9 and at one impact
velocity, 42 m/s. The results may be divided by specimen geometry.

The Kalthoff tests, Figure 9(a), gave a wide range of behavior depending upon the material. Among the
ultra-high strength steels, the maraging steel failed completely by shear localization, resulting in two pieces.
The HP-9-4-20, 4340, D6AC and 300M steels seem to exhibit behavior more similar to the under-aged 300
maraging steel in Part I of this report. A small crack grows straight ahead of the notch for less than three
millimeters and is followed by a tensile type failure that exhibits shear lips. The specimen is partially
fractured and remains intact. For D6AC and 300M the secondary crack was large, while for 4340 and
HP-9-4-20 the secondary tensile crack was either small or nonexistent. As shown in the first part of this
report, the tensile failure most likely occurs at later times after impact when the reflections of the impact
pulse from the free surfaces result in mostly mode-I type loading at the notch tip. The titanium alloy
exhibited only a very small amount of deformation at the notch tip. No failure could be seen. The 1018 steel
and aluminum alloy exhibited large amounts of plasticity at the impact area and the formation of a dimple at
the notch tip. The dimple appears to be due to negative mode-I loading since the material in that area has
expanded above the original specimen surface. No failure could be seen.

The stress free V-notch tests, Figure 9(b), were carried out at 42 m/s impact velocity. The high strength
steels showed varying degrees of failure. The 350 maraging steel again fully failed, but this time it did so in
a tensile mode. The D6AC specimen failure path was almost identical to that of the maraging steel, but
the presumably tensile crack arrested before total failure occurred. The remaining ultra-high strength steels
showed small cracks from one to three millimeters in length; these too are most likely tensile. Again, due to
the finite size of the specimen and the boundary conditions at the edges, tensile loading at later times results
in fracture of the metal. The titanium alloy, aluminum alloy and the 1018 steel showed no signs of failure.
Some dimpling due to plastic deformation at the notch tip could be seen in the lower strength aluminum
and steel materials.

Impact on the notch faces was achieved as shown in Figure 9(d) for an impact velocity of 42 m/s. In
these tests the aluminum alloy, titanium alloy and 1018 steel again showed no signs of failure. Surprisingly,
however, neither did any of the ultra-high strength steels.

Tests  on  flat  plates  were  delayed  by  errors  in  the  machining  of  specimens. 

2.4  CONCLUSIONS 

Punch tests and associated finite element modeling of such tests by Roessig and Mason [18] on the three
control materials, the aluminum alloy, the titanium alloy and 1018 steel, have indicated that titanium will
fail by shear localization in low clearance punch tests with a punch velocity of about 1 m/s. The 1018 steel
will begin to show signs of failure by shear localization at punch velocities of 15 m/s. The aluminum
alloy does not show signs of shear localization in punch tests up to 15 m/s; it fails by tensile crack growth
from the bottom of the plate. Furthermore, Zhou et al. [20] have reported shear band growth in the same
titanium alloy in Kalthoff tests with specimens twice as big as the ones used here for the same impact velocity.
Clearly, the titanium alloy has a high propensity to fail by shear localization. However, in the tests described
in Part II of this report it showed no signs of failure at all, let alone failure by shear localization. The
key issue appears to be the duration of the shear loading and the amount of accumulated shear strain. In the
punch tests, and in other tests that show shear localization in many materials, such as the torsional Hopkinson
bar, the geometry of the loading and support ensures that a region of material will see mostly shear loading
for a long period of time. In the tests described in Parts I and II of this report that is not the case. The
reflection of waves from the stress free surfaces results in a change in the loading conditions from mostly
shear to mostly bending. This results in tensile failure, if shear failure has not already occurred. Efforts to
save costs by using smaller plates only exacerbate the problem. Thus, there is a seemingly insurmountable
disadvantage to shear localization tests using this geometry: they only work on maraging steel or extremely
large specimens.

The disadvantage of existing tests is that they do not allow easy observation of the shear localization
failure. In the punch test the failure occurs on the interior; in the torsional Hopkinson bar the specimen is
small and the location of shear failure initiation is not known. So, it is worthwhile to examine a few more
variations on the specimens shown in Figure 10. Namely, changes in the boundary conditions will be made
so that the back surface of the plate is supported more like the punch test. This will result in a longer
duration of shear loading, and it is easily achieved using the punch-die apparatus of Roessig and Mason [18].

2.5  FURTHER  WORK 

As was stated in the results section, delays were encountered in the purchasing and machining of the materials
for the specimens in Part II of this study. Even though the contract will have expired, the work will be
continued to its completion. Changes in the boundary conditions on the specimens will be made to ensure
that a longer duration of shear loading will be experienced at the tip of the notch, whether it be a U-notch
or a V-notch. The effects of those changes will be modeled using finite elements. Tests will be carried out at
higher velocities and numerical solutions for the elastic stress field at the notch tip will be found. Results of
that investigation will be relayed directly to the site contact, Dr. J.C. Foster, at Eglin A.F.B.
Abstract

The Scalable Coherent Interface (SCI) is a recently developed IEEE standard that defines a scalable
high performance multiprocessor network in which bus-like functionality can be provided to a large number
of processor nodes. The SCI concept offers enormous potential improvement in both performance and
life-cycle cost with regard to the future of multiprocessor computing. Realizing the potential benefit of an
SCI-based parallel computer, the Joint Advanced Strike Technology (JAST) program office has selected SCI
with real-time extensions (SCI/RT) as its baseline approach for a universal data distribution network for
advanced strike aircraft of the next century.

Unfortunately, the high potential offered by SCI, as it is currently specified, cannot be directly exploited
for real-time systems. This is because the existing SCI specification is targeted at non-real-time (i.e.,
time-shared) applications. Suggestions have been made by various SCI working group members on how best to
extend or modify SCI to support real-time applications (SCI/RT). However, because of limitations in
each of the proposed candidate SCI/RT schemes, progress in developing a universally agreed-upon SCI/RT
standard has been slow.

In this document, we propose a more efficient and lower cost alternative SCI/RT scheme, called the job
packing scheme. We believe that the implementation of the job packing scheme will require minimal changes
to the existing SCI baseline protocol. The scheme is based upon the solid theoretical foundation of generalized
rate monotonic scheduling theory and bin-packing methodology, in which global information is exchanged
through selected bits of the SCI idle symbols. The scheme is flexible and load sensitive, and thus can work
efficiently in a dynamic environment like SCI.

In this effort we develop analytical methods to prove different properties of the scheme. A detailed
simulation platform is built for its performance evaluation and comparison. We have also built simulators
for some of the SCI/RT candidate schemes and have shown the superiority of the job packing scheme over
them. We then investigate the applicability of several popular real-time message scheduling schemes in the
SCI environment. Their pros and cons are studied and evaluated through simulation, and compared with
the job packing scheme.




Contents 


1 Introduction
  1.1 Overview of SCI
    1.1.1 SCI Node Structure
    1.1.2 Cache Coherence Protocol
    1.1.3 Packet Transportation Protocol
    1.1.4 Physical Layer
  1.2 Difficulties in Real-Time Support
  1.3 Our Contributions
2 Real-time Extensions to SCI (SCI/RT)
  2.1 Preemptive Priority Queue Protocol
  2.2 Train Protocol
  2.3 2-bit/8-bit Priority Protocol
3 The Job Packing Algorithm
  3.1 Job Admission
  3.2 Job Sequencing
4 Application of Popular Real-time Schemes
  4.1 Earliest Available First Algorithm
  4.2 Earliest Deadline First Algorithm
  4.3 Smallest Slack Time First Algorithm
  4.4 Farthest Away First Algorithm
5 Numerical Results and Comparison
  5.1 Simulation Model
  5.2 Workload Generation
  5.3 Comparative Study of Job Packing with Train and 2-bit Protocols
  5.4 Comparative Study of Job Packing with Popular Real-Time Schemes
  5.5 Discussion
6 Conclusion and Future Research




A  STUDY  OF  REAL-TIME  MESSAGE  TRANSMISSION  OVER 
THE  SCALABLE  COHERENT  INTERFACE  (SCI) 


Sarit  Mukherjee 


1  Introduction 

Large scale distributed memory processor networks, or massively parallel processors (MPP), have
become the computers of choice for large computationally intensive tasks in recent years. MPP
architectures consist of a set of nodes, where each node consists of processor(s), local memory, a
message router, and other support devices. MPP architectures often connect nodes through a direct
network in which each node has a connection to a set of other nodes, called neighbors. Since memory
is distributed, MPP nodes communicate by sending messages through the network. The Scalable
Coherent Interface (SCI) is standardized [6, 4] for very high performance multiprocessor systems
that support a coherent shared memory model scalable to systems with up to 64K nodes. It
delivers GBytes/sec transmission rates along unidirectional point-to-point links that are connected
into a ring topology. SCI was developed by a working group of leading computer researchers who
wished to overcome the fundamental physical limits imposed by bus technology. SCI provides the
services of a backplane.

Because  of  the  cost  and  performance  potential  offered  by  the  SCI  concept,  the  Joint  Advanced 
Strike  Technology  (JAST)  program  has  selected  SCI  and  its  unspecified  derivative  SCI  Real-Time 
(SCI/RT)  as  the  baseline  architecture  to  address  the  needs  of  military  aircraft  in  the  post-2005 
time  frame.  JAST  requirements  were  defined  by  several  system  studies,  including  Air  Force  PAVE 
PACE efforts and the Navy’s Next Generation Computer Resource Program. SCI/RT is intended
to be an enhancement of SCI that improves its real-time and fault tolerance capabilities.
Currently,  the  Air  Force  and  Navy  are  jointly  involved  in  two  separate  contracts  in  which  SCI-based 
hardware  is  being  developed. 

SCI is an attractive candidate for real-time communication because of its high performance
guarantee. However, the current version of the SCI protocol is not suitable for real-time
applications. The purpose of this research is to recommend minimal changes to the current standard to
make it amenable to real-time message transmission. Before describing the goals of our research in
detail, we first give a brief overview of the SCI protocol and then elaborate on the main difficulties
that make real-time traffic support over SCI inherently complex.
1.1 Overview of SCI


The SCI is a new high-speed multiprocessor interconnection standard [6, 4] that delivers GBytes/sec
transmission rates along unidirectional point-to-point links that are connected into a ring topology.
The protocol includes three different layers: the physical layer, the packet transportation layer,
and the cache coherence layer. Figure 1 shows the layers in the SCI protocol. Entities on the
cache coherence layer provide services to application entities, like processors and memory chips,
offering a shared memory with cache coherency. Entities on the packet transportation layer provide
services to entities on the cache coherence layer. Services include transmission of packets across
the interconnect. The task of the physical layer is to provide service to the packet transportation
layer, including transmission of symbols from one node-interface to the next. The physical layer is
implemented in a unidirectional point-to-point link, and various links have been defined.

Cache Coherence Layer
Packet Transportation Layer
Physical Layer

Figure 1: Layers in the SCI Protocol.

This  section  will  give  an  overview  of  the  SCI  node  structure,  and  then  introduce  the  layer 
protocol.  We  will  focus  mainly  on  the  packet  transportation  layer,  since  the  real-time  extension 
will  be  applied  on  this  layer,  keeping  the  cache  coherence  and  physical  layers  unchanged. 

1.1.1  SCI  Node  Structure 

The SCI interface (also referred to as the SCI node) is the unit through which the compute and memory
components communicate with other compute and memory components connected to the ring. Its
logical queueing structure is identical to that of a buffer-insertion ring interface [1] (see Figure 2).
The node interface consists of two unidirectional links (input and output) which are used to connect
nodes in a ring topology. The bypass FIFO stores packets arriving from the upstream neighbor while
the node is transmitting packets. This enables a node to concurrently (1) transmit packets, (2)
process packets addressed to other nodes, and (3) accept packets addressed to itself.
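
The queueing behavior described above can be pictured with a small Python model. This is a minimal sketch under stated assumptions, not the SCI specification: the class `RingNode`, its method names and the dictionary packet format are hypothetical, and the rule that buffered pass-through traffic is drained before a new local packet is started is a simplification of the real arbitration.

```python
from collections import deque


class RingNode:
    """Toy model of a buffer-insertion ring interface (names are illustrative)."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.transmit_queue = deque()  # packets generated locally
        self.bypass_fifo = deque()     # upstream packets held for forwarding
        self.received = []             # packets addressed to this node

    def accept_from_upstream(self, packet):
        """Strip packets addressed to this node; buffer all others for forwarding."""
        if packet["dst"] == self.node_id:
            self.received.append(packet)
        else:
            self.bypass_fifo.append(packet)

    def next_output(self):
        """Choose the next packet for the output link, or None if nothing is pending.

        Assumption: pass-through traffic in the bypass FIFO is drained before a
        new local packet is started. The actual SCI rules are more detailed.
        """
        if self.bypass_fifo:
            return self.bypass_fifo.popleft()
        if self.transmit_queue:
            return self.transmit_queue.popleft()
        return None
```

A node built this way keeps pass-through packets in arrival order, which is the role the text assigns to the bypass FIFO, while packets addressed to the node itself never reach the output link.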

1.1.2  Cache  Coherence  Protocol 

High-performance processors use local caches to reduce effective memory-access times. In a
multiprocessor environment, this leads to potential conflicts. The SCI cache-coherence protocol defines
mechanisms that guarantee consistent data even when data are locally cached and modified by
multiple processors. The SCI cache-coherence protocol can be hardware based, thus reducing both
the operating system complexity and the software effort to ensure consistency.

Figure 2: SCI interface (also referred to as SCI node).

SCI  uses  a  distributed  directory-based  cache-coherence  protocol.  Each  shared  line  of  memory 
is  associated  with  a  distributed  list  of  processors  sharing  that  line.  All  nodes  with  cached  copies 
participate  in  the  update  of  this  list.  Every  memory  line  that  supports  coherent  caching  has  an 
associated  directory  entry  that  includes  a  pointer  to  the  processor  at  the  head  of  the  list.  Each 
processor  cache-line  tag  includes  pointers  to  the  next  and  previous  nodes  in  the  sharing  list  for  that 
cache  line.  Thus  all  nodes  with  cached  copies  of  the  same  memory  line  are  linked  together  by  those 
pointers. Coherence protocols can be selectively enabled, based on bits in the processor’s
virtual-address-translation table. Depending on processor architecture and application requirements, pages
could be coherently cached, non-coherently cached, or not cached at all.

1.1.3  Packet  Transportation  Protocol 

The  key  idea  behind  the  SCI  packet  transportation  protocol  is  the  use  of  unidirectional,  point- 
to-point  links  that  can  be  clocked  at  a  rate  independent  of  the  signal  latency  between  nodes. 
Each  node-interface  has  one  input-link  and  one  output-link.  The  basic  logical  structure  of  the 
interconnect is a ring. By using switches, which are special nodes with more than one
node-interface, multiple rings can be connected and various topologies can be formed. Since the links
are unidirectional, all information in a ring moves in the same direction.

The  interfaces  communicate  by  exchanging  packets,  which  are  finite  sequences  of  symbols.  A 
symbol is 16 bits long and is the smallest unit of information transmitted between node-interfaces. A
link  transmits  one  symbol  at  a  time.  Because  SCI  protocols  are  synchronous,  special  idle  symbols 
are  transmitted  across  the  links  in  the  absence  of  packets,  and  at  least  one  idle  symbol  is  always 
sent  between  consecutive  packets.  Although  SCI  uses  idle  symbols  for  a  variety  of  purposes,  they 
are  of  key  importance  in  its  flow  control  protocol  [13],  which  prevents  node  starvation  and  fairly 
allocates  bandwidth  to  all  nodes  on  the  ring. 

There  are  two  main  types  of  packets,  send  packets  and  echo  packets.  A  send  packet  carries 
information  generated  by  the  higher  layer  to  the  destination  node.  An  echo  packet  is  returned  to 
the  source  by  the  destination  as  an  acknowledgment.  The  send  packets  can  be  divided  into  two 
sub-types,  request  send  and  response  send  packets,  corresponding  to  the  requests  and  responses 
generated  at  a  higher  layer,  respectively.  Similar  distinctions  can  be  made  for  echo  packets  as  well. 

When a node wants to send a packet, it places the packet in its output FIFO. If the bypass
buffer is empty and the node is not currently transmitting a packet from the address decoder, the
send packet may be output onto the ring immediately. If the bypass buffer is not empty, or the node is
currently busy transmitting a packet, the send packet must wait in the output FIFO. When the
packet is transmitted, a copy of that packet must be saved in an optional active buffer. When the
echo packet for the send packet is received, the copy is either discarded or used for retransmission.

A packet is transmitted symbol by symbol to the downstream neighbor. When a packet arrives
at a node, the destination address field (target ID) in the packet header is checked. If the
address does not match the address of the current node, the packet is passed on to the next node. This
continues until the packet reaches the destination node, where it is stripped from the ring.

When  a  packet  is  to  be  passed  but  the  output  FIFO  (source  queue)  at  that  node  is  currently 
transmitting  a  packet,  or  the  bypass  FIFO  is  not  empty,  the  passing  packet  is  routed  into  the  bypass 
FIFO  instead.  If  a  passing  packet  and  an  output  (source)  packet  are  ready  for  transmission  at  the 
same  time,  the  output  queue  (source  queue)  is  given  priority  and  the  passing  packet  is  routed  to 
the  bypass  queue.  When  the  output  queue  (source  queue)  is  done  transmitting,  if  the  bypass  FIFO 
has accumulated any symbols, output resumes from the bypass FIFO. This is called the recovery stage,
and it lasts until the bypass FIFO is completely emptied. The node is not allowed to transmit any
output packet during the recovery stage; the output packets wait in the output FIFO.
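
The arbitration rules above can be summarized in a small packet-level sketch. It is a simplification written for illustration, not the SCI specification: it works in whole packet slots rather than symbols, and the class and method names (`RingNode`, `step`) are hypothetical.

```python
from collections import deque


class RingNode:
    """Packet-level toy model of one node's output arbitration (illustrative only)."""

    def __init__(self, own_packets=()):
        self.output_fifo = deque(own_packets)  # locally generated packets (source queue)
        self.bypass_fifo = deque()             # passing packets held back by this node

    def step(self, passing=None):
        """Advance one packet slot and return the packet placed on the output link.

        `passing` is a packet from upstream addressed to some other node, or None.
        """
        if self.bypass_fifo:
            # Recovery stage: drain the bypass FIFO first; own packets keep waiting.
            if passing is not None:
                self.bypass_fifo.append(passing)
            return self.bypass_fifo.popleft()
        if self.output_fifo:
            # The source queue has priority; a simultaneous passing packet is buffered.
            if passing is not None:
                self.bypass_fifo.append(passing)
            return self.output_fifo.popleft()
        # Nothing to send locally and nothing buffered: forward directly (or stay idle).
        return passing


node = RingNode(own_packets=["own-1", "own-2"])
trace = [node.step("pass-1"), node.step(), node.step(), node.step("pass-2")]
print(trace)
```

In this run the passing packet that collides with `own-1` is buffered and then drained before `own-2` is allowed onto the ring, which is the recovery behavior described above.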

When the packet arrives at the target node, if the input FIFO is not full, the received packet is
stripped and placed into the input FIFO. Otherwise the received packet is discarded due to buffer
overflow. The last four symbols of the send packet are replaced with an echo packet that continues
its way around the ring to the packet’s source.

When the echo packet reaches the source node, it is matched with the saved copy of the send
packet. The saved packet is discarded if the transmission was successful. Otherwise it is
retransmitted.


1.1.4  Physical  Layer 

The task of the physical layer is to provide a unidirectional point-to-point link to transmit symbols
from one node to the next. The SCI protocol defines various links, including a parallel electrical link
operating at 1 Gbyte/sec and used over short distances (meters); a serial optical link operating at 1
Gbit/sec and used over longer distances (kilometers); and a serial electrical link operating at 1 Gbit/sec
and used over intermediate distances (tens of meters).

1.2  Difficulties  in  Real-Time  Support 

The  SCI  defines  an  interconnect  system  that  scales  well  as  the  number  of  attached  processors 
increases,  that  provides  a  coherent  memory  system,  and  that  defines  a  simple  interface.  As  the  SCI 
was  intended  for  time-shared  applications  (i.e.,  non  real-time),  the  SCI  designers  were  concerned 
with  the  optimization  and  efficient  implementation  to  achieve  low  average  response  time,  high 
average  throughput,  and  fairness  in  bandwidth  utilization. 

However,  real-time  systems  require  guarantees  on  when  certain  tasks  complete.  The  notion  of 
a  deadline  is  used  to  measure  the  timeliness  of  task  completion,  that  is,  if  a  task  completes  before 
its  deadline,  it  is  on  time.  The  correctness  of  the  system  depends  on  meeting  the  deadlines  of  the 
tasks  [18].  Therefore,  guaranteed  timing  behavior  (i.e.,  guaranteed  latency)  is  the  essential  metric 
for  real-time  systems.  If  the  system  activities  are  schedulable,  then  all  requests  will  be  serviced.  For 
this  reason,  fairness  and  guaranteed  forward  progress  are  seldom  of  concern  in  real-time  systems. 

Unfortunately,  the  SCI  protocol,  as  it  stands  today,  cannot  be  applied  to  real-time  systems. 
This  is  because  the  SCI  protocol  ensures  forward  progress  but  not  deterministic  latency.  Thus,  the 
fundamental  problem  is  to  modify  the  SCI  protocol  from  one  which  guarantees  forward  progress  to 
a  SCI/RT  protocol  which  guarantees  latency. 

The problem of obtaining guarantees on latency with a distributed network such as SCI is
inherently complex^1. The complexity arises mainly because of the buffer insertion feature of SCI.
This is explained with the help of figure 3. Consider an SCI ringlet with three nodes as shown, with
one intermediate node between the source and the target. We are concerned with providing
deterministic latency between the source and the target. The latency can be divided into two
components: (1) waiting time at the source, which is the local component, and (2) the ring transfer
time^2 between the source and the target, which is the distributed component. The most critical
component of latency in a distributed environment is the ring transfer time. The ring transfer time
in SCI consists of three components, namely the transmission time of the packet by the source, the
propagation delay from source to target, and the buffering delay (in the bypass FIFO) in the intermediate
nodes. Both transmission time and propagation delay^3 are fixed for a given SCI network. The
only variable component is the buffering delay at the intermediate nodes. As the following example
shows, this component of delay makes real-time message delivery over an SCI ring inherently
complex.

^1 Real-time solutions exist for similar ring protocols, e.g., slotted ring [12], token ring such as FDDI [14], etc.

^2 It is the time between the transmission of the first bit of a packet from the source to the ring and the reception
of the last bit of the packet by the target from the ring.

Figure 3: SCI ring with and without a busy intermediate node between source and target.

Consider  figure  3(a).  When  the  intermediate  node  is  idle  (i.e.,  not  transmitting),  then  the 
source  to  target  transmission  traces  the  path  indicated  by  the  dashed  line  and  reaches  the  target 
after  the  transmission  time  and  the  propagation  delay.  This  results  in  deterministic  ring  transfer 
time.  However,  if  the  intermediate  node  is  busy  (i.e.,  transmitting),  the  source  to  target  packet  is 
buffered  in  the  bypass  FIFO  of  the  intermediate  node  (see  the  two  part  dashed  line  in  figure  3(b)). 
It is forwarded to the target after the packet transmission from the intermediate node is completed.
Thus the ring transfer time between the source and the target becomes a function of the load at
the intermediate node and, moreover, of the length of the packets that contend for
concurrent transmission at the intermediate node. This results in non-determinism in the ring
transfer time in SCI.
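
The decomposition of the ring transfer time can be written down directly. The sketch below is only an illustration of the three components named above; the function name and all numbers are hypothetical and use arbitrary time units.

```python
def ring_transfer_time(transmission_time, propagation_delay, bypass_delays):
    """Ring transfer time = transmission + propagation + bypass buffering delays.

    `bypass_delays` holds one buffering delay per intermediate node.
    """
    return transmission_time + propagation_delay + sum(bypass_delays)


# Hypothetical figures in arbitrary time units.
idle = ring_transfer_time(4, 2, [0])  # as in figure 3(a): intermediate node idle
busy = ring_transfer_time(4, 2, [5])  # as in figure 3(b): held for 5 units in the bypass FIFO
print(idle, busy)
```

Only the last argument varies with the traffic of the intermediate nodes, which is why the source alone cannot bound the transfer time.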

Protocols similar to SCI, e.g., slotted ring or token ring [1], do not suffer from this problem,
i.e., the intermediate nodes do not introduce any additional buffering delay (other than the one for
address decoding, which is constant and can be included in the propagation delay). Thus, the ring
transfer time in these protocols is fixed. Therefore, in order to guarantee latency in these protocols,
determining the start time for transmission of a packet from the source is sufficient. However, as
the above example shows, the same strategy cannot be applied to SCI/RT.

^3 Since the distance between the source and the target is fixed, and the address decoding time at an
intermediate node is constant, the source to target propagation delay is fixed.

There are several other difficulties with the SCI protocol regarding real-time traffic support.
These difficulties include the FIFO queueing discipline, an insufficient number of priority bits, etc., which
have been independently identified by other researchers as well [5, 7]. We do not elaborate on these in
this document, since the solution to the non-deterministic ring transfer time encompasses solutions
to them as well.

1.3  Our  Contributions 

In this report we discuss our studies on the development of and experimentation with real-time
protocols for SCI. The objective of the research is to develop a theoretical foundation for the
scheduling of real-time messages using the SCI. An underlying goal is to keep the differences between
the SCI and SCI/RT protocols to an absolute minimum. This will allow
future chip-sets to support both protocols, or at least allow pin-for-pin compatible versions to be
available. To this effect, at the time of writing this report, we have achieved the following:

•  We  have  established  a  theoretical  foundation  under  which  various  SCI/RT  protocol  options 
can  be  analyzed.  We  have  designed  and  experimented  with  workloads  which  are  representative 
of  the  tasks  that  a  SCI/RT  would  be  subjected  to,  and  have  shown  how  such  a  workload  can 
be  synthetically  generated. 

•  We  have  created  a  simulation  platform  for  the  two  most  popular  SCI/RT  candidate  schemes 
already  proposed  (they  are  briefly  described  in  section  2).  Through  simulation  and  analysis 
of  the  results  we  have  outlined  some  of  the  shortcomings  of  these  schemes. 

•  We  have  proposed  a  novel  real-time  message  scheduling  scheme  over  SCI  called  the  job  packing 
algorithm.  The  method,  based  on  generalized  rate  monotonic  theory  [17],  is  a  distributed 
implementation  of  a  real-time  protocol.  The  scheme  consists  of  two  stages:  job  admission 
and  job  scheduling.  The  first  stage  can  be  computed  centrally  by  the  node  where  the  job 
arrives.  The  second  stage  is  carried  out  in  a  distributed  fashion  only  for  the  admitted  jobs. 
In  this  way,  the  computational  overhead  is  reduced  drastically,  and  at  the  same  time  delivery 
is  guaranteed. 

•  We have built a detailed simulation platform for the job packing algorithm. The centralized
and the distributed components of the scheme are simulated and evaluated through the
synthetic workload.

•  We have adopted and studied several existing popular real-time message scheduling techniques
for SCI. This, we believe, constitutes several other options towards the realization of SCI/RT.
A generalized simulation platform is developed to examine their performance and feasibility.

•  Last but not least, using the simulators and the synthetic workload, we have done a
detailed comparison of the different SCI/RT schemes. They include the candidate SCI/RT
schemes, our job packing scheme and the other adopted schemes. The results, described in
section 5, show the relative performance of the different schemes. They also establish the superiority
of the job packing scheme.

The  rest  of  the  report  is  organized  as  follows.  In  section  2  we  describe  the  candidate  SCI/RT 
schemes  and  point  out  their  shortcomings.  In  section  3  we  describe  our  job  packing  algorithm 
in  detail.  Several  other  real-time  message  scheduling  techniques  and  their  adoptions  are  detailed 
in section 4. Details of the synthetic workload generation, the simulation experiments, and the
comparisons are presented in section 5. Section 6 concludes the report and points out future research
directions.

2  Real-time  Extensions  to  SCI  (SCI/RT) 

When the SCI standard was awaiting approval, interest had already grown in using the SCI protocol
in real-time environments. This activity branched off into the SCI/RT working group (IEEE
P1596.6) and work has progressed since then. The goal of the SCI/RT working group is to modify the
existing SCI protocol for real-time purposes. Compared to time-shared systems like SCI, real-time
systems have additional requirements which affect the design process. Real-time scheduling
requires the interconnect to have worst case latency guarantees, worst case bandwidth availability
guarantees, and the ability to ensure that tasks running in the background using excess bandwidth
do not interfere with those that are currently scheduled to receive the available bandwidth. To
meet these requirements, several modifications to the SCI protocol have been proposed. In this
section we describe three major SCI/RT proposals: Preemptive Priority Queue Protocol [2], Train
Protocol [16], and 2-bit/8-bit Protocol [7].

2.1  Preemptive  Priority  Queue  Protocol 

The  preemptive  priority  queue  protocol  [2]  is  designed  to  work  at  the  network  (ring-local  subaction) 
protocol  level,  leaving  the  inter-network  (end-to-end  transaction)  and  cache-coherency  protocol 
levels  unchanged,  and  thus  allowing  full  interoperability  with  SCI  through  switches  or  bridges. 
It  views  the  SCI/RT  system  as  a  queueing  network.  There  are  three  basic  components  in  the 
queueing network: host input queue, link input queue, and bypass queue. The protocol proposes to
turn them into preemptive priority queues. The echo waiting queue and the response waiting queue
are made content addressable for efficiency. The amount of space in the bypass queue needed to
absorb an arriving packet equals the amount of the packet remaining to be transmitted from the
host input queue. Therefore, nodes should preempt only the minimum number of lower-priority
packets necessary to ensure completion of transmission of the packet from the host input queue,
as determined at the time of the packet’s arrival.

By allowing preemption, it can efficiently support rate-monotonically scheduled message transmission
over the network. For details of the protocol refer to [2]. This proposal was not accepted
because it is very expensive to implement and it deviates significantly from the original SCI
specifications.

2.2  Train  Protocol 

The train protocol [16] is a token-based scheme. It is an alternative to the SCI/RT
approaches that modify the SCI protocol to create a system obeying priority-based
scheduling theories, such as the preemptive priority queue protocol [2].

The train protocol uses a special token, the LocalMotive, that circulates around the ring, carrying
priority information. The train protocol provides an 8-bit priority field, i.e., 256 priority levels, to
differentiate the order in which tasks should complete. This guarantees that lower priority
tasks will not interfere with tasks of higher priorities. A train is sent around the interconnect
to determine which packets should be sent and which should be saved for later transmission to
avoid interference with higher priority traffic. The train guarantees the most efficient use of the
interconnect based on the priority level, while at the same time guaranteeing that no bandwidth is
unused on the interconnect. The train protocol also provides a fast transmit mechanism to reduce the
latency of transmission during times of low activity in the interconnect.

The  train  is  made  up  of  a  LocalMotive  followed  by  tickets.  The  LocalMotive  carries  information 
about  the  train  structure,  leads  the  train  circulating  around  the  ring,  and  provides  the  nodes  with 
information  needed  to  make  a  transmission  decision.  The  tickets  are  placed  after  the  LocalMotive 
by  a  node  wishing  to  transmit,  and  when  returned  to  the  node  may  grant  permission  to  transmit  a 
packet.  The  packet  which  had  received  permission  and  the  other  tickets  will  follow  the  LocalMotive 
circulating  on  the  ring.  A  ticket  contains  all  the  information  about  the  packet  that  the  node  desires 
to  send.  The  ticket  also  provides  the  negotiation  mechanism  by  which  the  nodes  determine  which 
packets  are  to  be  sent. 

There is a designated TicketMaster node in the ring that performs the central services for the
train protocol. This node is responsible for generating and removing the LocalMotive from the
interconnect as well as marking the train that has had the ticket attached.

2.3  2-bit/8-bit Priority Protocol


The  2-bit/8-bit  priority  protocol  [7]  is  a  hybrid  protocol  scheme.  The  2-bit  priority  protocol  is 
simpler  to  implement  and  appears  to  be  sufficient  for  typical  personal-computer  and  workstation 
applications.  The  8-bit  priority  protocol  implementation  is  more  complex,  but  provides  a  relatively 
complete  set  of  protocols  for  implementing  hardware-based  rate-monotonic  scheduling  [17]  between 
limited  priority  levels  per  node.  The  8-bit  priority  protocol  is  very  similar  to  the  train  protocol. 
Therefore,  in  this  section,  we  describe  the  2-bit  priority  protocol  only. 

The major goal of the 2-bit priority scheme is to define an efficient mechanism for transmitting
prioritized packets over SCI, to support real-time personal-computer/workstation applications by
providing superior bandwidth and latencies for prioritized transactions. It defines four priority
levels divided between two classes, as shown in table 1.


Class    Subclass   Level   Usage       Bandwidth Share
Unfair   unfairHi   3       emergency   3/4 of total ring bandwidth
         unfairLo   2       normal
Fair     fairHi     1       emergency   1/4 of total ring bandwidth
         fairLo     0       normal

Table 1: Priority and class definitions of 2-bit protocol.


The unfair and fair classes are allocated different portions of the ring bandwidth depending
on the load of the unfair traffic. If there is no unfair traffic, the full ring bandwidth can be used
by the fair class. In the presence of unfair traffic, most of the bandwidth is allocated to it,
the residual being used for the fair class. The high and low subclasses within each class provide
a finer mechanism to reduce the latency of emergency messages. The 2-bit protocol (re)defines the
idle symbol to carry different priority and node-status information to implement the bandwidth
allocation. With only four priority levels, this protocol is too simple to handle a number of concurrent
real-time message transfer sessions.
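
The four levels of table 1 can be captured in a small enumeration. This is a sketch for illustration only: the bit layout used by the two helper functions (upper bit for the class, lower bit for the subclass) is an assumption read off the level numbering in the table, not something stated in the protocol description, and all names are hypothetical.

```python
from enum import IntEnum


class Priority(IntEnum):
    """The four priority levels listed in table 1."""
    FAIR_LO = 0
    FAIR_HI = 1
    UNFAIR_LO = 2
    UNFAIR_HI = 3


def is_unfair(level):
    # Assumed encoding: the upper bit selects the unfair class.
    return bool(level & 0b10)


def is_emergency(level):
    # Assumed encoding: the lower bit selects the Hi (emergency) subclass.
    return bool(level & 0b01)


for level in Priority:
    print(level.name, int(level), is_unfair(level), is_emergency(level))
```

With two bits there is no room for more than these four combinations, which illustrates why the scheme cannot distinguish many concurrent real-time sessions.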

3  The  Job  Packing  Algorithm 

In this section we outline our algorithm to conduct real-time job scheduling in an SCI ring. The
algorithm is based on generalized rate-monotonic scheduling theory (GRMS) [17]. However, unlike
GRMS, our scheme is a distributed one that is suitable for an environment like SCI. The proposed
algorithm performs two essential functions: (1) job admission and (2) job sequencing. The job
admission algorithm, running at each node, decides whether a new job can be admitted or not,
so that its messages can be delivered within the deadline without violating the deadlines of the
previously accepted jobs. Once a job is accepted, the next step is to sequence the job in the nodes’
transmission “calendar”, i.e., to determine when the job should be transmitted with respect to the
existing schedule of jobs.

The intuition behind dividing the scheduling functionality into two parts is to achieve efficiency.
GRMS, although a powerful technique, is applicable to a centralized environment. An SCI
ring is inherently a distributed system, and the application of GRMS is not straightforward. In our
algorithm, we use this powerful technique in such a way that it becomes amenable to distributed
treatment. To achieve our goal, we perform the job admission locally, i.e., a node need not consult
other nodes in order to admit a new job. This is achieved by keeping sufficient information per
node about the global scheduling behavior. Note that the easiest, naive way of achieving this
would be to replicate the schedule of all the nodes in each node. However, this is a large amount of
information and is very expensive to maintain as well. We keep minimal global information to perform this task.
In the job sequencing phase the neighboring nodes exchange information and update their job
sequences using a derivative of the GRMS technique developed by us. This update information makes
one complete round through the ring to let every node know and collect the necessary information.
Thus, global information is exchanged only when a job is accepted, not otherwise. Moreover, the
information kept per node is sufficient to make sure that a local decision about a job admission
would always be accepted by all the nodes in the ring, if they were to take part in the job admission
decision.

We assume that a job consists of a set of real-time messages. For ease of exposition, in the
rest of this report we will assume that the size of the set is unity, and will use the terms job and message
interchangeably. A real-time message M is defined as a five-tuple, M = {P, D, C, S, T}, where
P, D, and C are, respectively, the period, deadline and transmission time of the message transmitted
by source S to target T. The message is periodically generated at source S after every P time
units. It is ready for transmission at the beginning of the period and it has to be received within
the deadline. We assume that D = P in the rest of the discussion. The message transmission
requires C time units. Notice that a job defined this way succinctly models real-time process
control messages such as messages generated periodically by a sensor, real-time animation, etc.
Aperiodic messages, such as interrupts, can also be
modeled in our framework by assuming that the message does not repeat.

In the rest of this section we describe the job admission and sequencing schemes. At the time
of writing this report, all the necessary concepts regarding the algorithms were developed. Some of
the proofs of the algorithms need some more work and will be reported in a future paper. Below
we explain the main concept behind the scheme and elaborate it through instructive and detailed
examples. We conduct a performance comparison of our scheme with the other proposed SCI/RT
schemes in section 5.

3.1  Job Admission


As  mentioned  before  the  job  admission  decision  can  be  carried  out  locally  at  the  arriving  node. 
The  necessary  global  information  is  kept  in  the  node’s  data  structures.  Once  a  job  is  admitted, 
those  data  structures  are  updated. 

In order to check if a job can be admitted into the system, the admission controller ensures
that the utilization^4 of the node remains below 1 and all the local jobs meet their deadlines. To
compute the utilization of node s, we only consider the messages that originate at or pass through
the node. The utilization of node s, denoted as $U_s$, is defined as

$$U_s = \sum_{j} \frac{C_j}{P_j},$$

where the sum runs over every message j in this set.

If the newly arrived job makes $U_s$ higher than 1, the job is rejected straight away.

Next  the  deadline  of  the  newly  arrived  job  is  considered.  The  end-to-end  deadline  (i.e.,  the 
time  within  which  the  message  has  to  reach  its  destination)  can  be  broken  down  into  two  parts: 
transmission  delay  and  propagation  delay.  Transmission  delay,  in  turn,  consists  of  two  parts,  namely 
transfer  time  (C),  and  waiting  time  at  intermediate  nodes,  for  higher  priority  job  transmission.  The 
computation  of  the  propagation  delay  and  waiting  time  are  explained  below: 

As mentioned before, each node keeps a job sequence, and the newly admitted job has to fit in
the sequence without violating any timing constraints. The job sequence for node s is defined as
$\tau_1, \tau_2, \ldots, \tau_i, \ldots, \tau_j, \ldots, \tau_n$, where the priority of job $\tau_i$ is higher than that of $\tau_j$ if $i < j$.
Note that we have omitted the node subscript since there is no confusion. The node subscript will be used if
there is any confusion. The minimum delay between two successive transmissions of $\tau_i$ is a function of the job
sequence [17], and we denote it by $f_{\tau_i}(\tau_1, \tau_2, \ldots, \tau_n)$. The function can be evaluated as:

$$f_{\tau_i}(\tau_1, \tau_2, \ldots, \tau_n) = \frac{\sum_{j=1}^{i-1} C_j}{1 - \sum_{j=1}^{i-1} \frac{C_j}{P_j}}$$

The waiting time of the message $\tau_j$ will be $f_{\tau_j}(\tau_1, \tau_2, \ldots, \tau_{n-1}, \tau_n)$. So we have:

$$\text{Transmission delay} = C_j + f_{\tau_j}(\tau_1, \tau_2, \ldots, \tau_n).$$
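
As a quick numerical check of the waiting-time function, read here as the total transmission time of the higher-priority jobs divided by one minus their total utilization (a reading inferred from the fractions that appear in the example of this section), take a sequence that starts with two jobs having $C_1 = 1$, $P_1 = 5$ and $C_2 = 1$, $P_2 = 6$. The first job never waits, and the jobs in the second and third positions wait for:

$$f_{\tau_1} = 0, \qquad f_{\tau_2} = \frac{1}{1 - \frac{1}{5}} = \frac{5}{4}, \qquad f_{\tau_3} = \frac{1 + 1}{1 - \frac{1}{5} - \frac{1}{6}} = \frac{2}{19/30} = \frac{60}{19} \approx 3.16.$$

The waiting time therefore grows with both the amount of higher-priority work and how heavily that work already loads the link.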


The propagation delay is contributed by the link propagation delay and the block time at the
bypass buffer of the intermediate node(s). So we define

$$\text{Propagation delay} = E + B,$$

where E is the total link delay on the network from the source to the destination and B is the sum
of the block times at the intermediate node(s). For a given source-destination pair, E is fixed and
can be ignored by subtracting E from P. B can be computed as:

$$B = \sum_{\forall\ \text{intermediate node}} f_{\tau_j}(\tau),$$

where $\tau$ is the set of messages that are scheduled before the message $\tau_j$ at an intermediate node.
The utilization $U'_j$ of each message can be revised with regard to the block time as:

$$U'_j = \frac{C_j + B_j}{P_j}.$$

In the admission control procedure we try to keep $U'_j < 1$ at node s. In the following we elaborate
the job admission process through an example.

^4 Utilization of a job is defined as C/P.


Example:  Consider  an  SCI  ringlet  consisting  of  three  nodes  (as  shown  in  figure  4).  The  job 
arrivals  at  each  node  are  shown  in  the  following  table. 


Node     Job           C    D    P
Node 1   $\tau_{11}$   1    9    9
         $\tau_{12}$   1    10   10
Node 2   $\tau_{21}$   1    5    5
         $\tau_{22}$   1    6    6
Node 3   $\tau_{31}$   2    7    7
         $\tau_{32}$   1    9    9

Figure 4: A three node SCI ring.
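
The first admission condition for this example can be checked mechanically from the table. The sketch below is an illustration, not the report's implementation: the job labels are plain-text stand-ins for the jobs in the table, and the routing in `MESSAGES_AT` (each node's messages also pass through the next node downstream) is an assumption inferred from the terms that appear in the utilization sums.

```python
from fractions import Fraction

# (C, P) for each job, taken from the table of the example.
JOBS = {
    "t11": (1, 9), "t12": (1, 10),
    "t21": (1, 5), "t22": (1, 6),
    "t31": (2, 7), "t32": (1, 9),
}

# Assumed routing: messages that originate at or pass through each node.
MESSAGES_AT = {
    1: ["t11", "t12", "t31", "t32"],
    2: ["t21", "t22", "t11", "t12"],
    3: ["t31", "t32", "t21", "t22"],
}


def node_utilization(node):
    """Sum of C/P over the messages that originate at or pass through `node`."""
    return sum(Fraction(*JOBS[name]) for name in MESSAGES_AT[node])


utilization = {node: node_utilization(node) for node in MESSAGES_AT}
first_condition_ok = all(u <= 1 for u in utilization.values())

for node, u in utilization.items():
    print(f"U_{node} = {u} = {float(u):.4f}")
print("first condition satisfied:", first_condition_ok)
```

Exact fractions are used so that the comparison against 1 is not affected by rounding.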


The utilization of each node is:

$$U_1 = \sum_j \frac{C_j}{P_j} = \frac{1}{9} + \frac{1}{10} + \frac{2}{7} + \frac{1}{9} \approx 0.61 < 1,$$

$$U_2 = \sum_j \frac{C_j}{P_j} = \frac{1}{5} + \frac{1}{6} + \frac{1}{9} + \frac{1}{10} \approx 0.58 < 1,$$

$$U_3 = \sum_j \frac{C_j}{P_j} = \frac{1}{5} + \frac{1}{6} + \frac{2}{7} + \frac{1}{9} \approx 0.76 < 1.$$

So the first condition is satisfied. If we use earliest deadline first scheduling, we get

Node 1:
  $B(\tau_{11}) = 0 + 60/19 = 3.16$          $U(\tau_{11}) = (1 + 3.16)/9 < 1$
  $B(\tau_{12}) = 9/8 + 270/47 = 6.87$       $U(\tau_{12}) = (1 + 6.87)/10 < 1$

Node 2:
  $B(\tau_{21}) = 0 + 0 = 0$                 $U(\tau_{21}) = (1 + 0)/5 < 1$
  $B(\tau_{22}) = 5/4 + 5/4 = 2.5$           $U(\tau_{22}) = (1 + 2.5)/6 < 1$

Node 3:
  $B(\tau_{31}) = 0 + 0 = 0$                 $U(\tau_{31}) = (2 + 0)/7 < 1$
  $B(\tau_{32}) = 14/5 + 189/38 = 7.77$      $U(\tau_{32}) = (1 + 7.77)/9 < 1$

These messages are schedulable under the earliest deadline first scheduling discipline.
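
The block times of the example can be reproduced with the following sketch. It rests on several assumptions that are a reading of the example rather than statements from the text: the waiting time behind a set of jobs is taken as sum(C) / (1 - sum(C/P)); the first term of each B counts only the source node's own jobs; the second term counts all jobs at the one intermediate node; and a deadline tie at a node is broken in favor of that node's own job. All names are hypothetical.

```python
from fractions import Fraction

# (C, D, P, source node) for each job of the three-node example; D equals P.
JOBS = {
    "t11": (1, 9, 9, 1), "t12": (1, 10, 10, 1),
    "t21": (1, 5, 5, 2), "t22": (1, 6, 6, 2),
    "t31": (2, 7, 7, 3), "t32": (1, 9, 9, 3),
}
NEXT_NODE = {1: 2, 2: 3, 3: 1}  # assumed: each message crosses one downstream node


def wait(ahead):
    """Waiting time behind the jobs in `ahead`: sum(C) / (1 - sum(C/P))."""
    total_c = sum(JOBS[j][0] for j in ahead)
    total_u = sum(Fraction(JOBS[j][0], JOBS[j][2]) for j in ahead)
    return Fraction(total_c) / (1 - total_u)


def edf_order(names, node):
    """Earliest deadline first; on a tie the node's own job goes first (assumed)."""
    return sorted(names, key=lambda j: (JOBS[j][1], JOBS[j][3] != node, j))


def block_terms(job):
    """Return (wait at the source, wait at the intermediate node) for `job`."""
    src = JOBS[job][3]
    mid = NEXT_NODE[src]
    local = edf_order([j for j in JOBS if JOBS[j][3] == src], src)
    at_mid = edf_order([j for j in JOBS if JOBS[j][3] in (src, mid)], mid)
    return wait(local[:local.index(job)]), wait(at_mid[:at_mid.index(job)])


def revised_utilization(job):
    """(C + B) / P with B the sum of the two block terms."""
    c, _, p, _ = JOBS[job]
    return (c + sum(block_terms(job))) / p


for name in JOBS:
    first, second = block_terms(name)
    print(name, first, second, float(revised_utilization(name)))
```

Every revised utilization computed this way stays below 1, which agrees with the conclusion that the message set is schedulable.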

3.2  Job  Sequencing 

When  a  new  job  is  admitted,  all  the  intermediate  nodes  (i.e.,  the  nodes  that  fall  between  the  source 
and  the  destination  of  the  message)  need  to  place  the  job  in  their  already  existing  job  sequence. 
Job  sequencing  is  performed  in  a  distributed  fashion  among  all  the  nodes. 

Consider a unidirectional ring with $n$ nodes $N_1, N_2, \ldots, N_n$. If we ignore the link delay (which
can be done as described in the previous section), then a message output at $N_i$ is immediately
available at $N_{i+1}$. This way of modeling the scheduling is helpful because the link delay can be
factored out of the schedule construction. When a message becomes available at node $N_{i+1}$, the
message either has to be transmitted immediately, or will wait in the bypass buffer. In other
words, the incoming message contends for the outgoing link with the node's own messages. In case
of contention, the job sequence determines which message will be transmitted first.

Each node keeps a job sequence which determines the priority of transmission in case of contention.
When a new job arrives, either from this node or from an upstream node, the sequence
is recomputed to make room for the new job. Suppose a node has a current job sequence
$s_1 s_2 \ldots s_k$ and a set of newly arrived jobs $\tau_1 \tau_2 \ldots \tau_n$, as shown in figure 5. Then the newly arrived
jobs will be inserted into the sequence to form the new job sequence. The mechanism by which the
new schedule is decided is explained below.

Figure 5: Job sequencing procedure.

In order to compute and update the sequence we introduce two state variables per node, L[s,j]
and M[s,j]. L[s,j] is the latest time by which job j has to be scheduled at node s, so that its
deadline can be met. M[s,j] is the earliest schedulable position of the job at node s in a job
sequence so that the jobs before this one can meet their deadlines. After the admission of a new
job, these two variables are updated in a distributed fashion as described in the following. We use
D[s,j] to denote the deadline of the job at node s.

\[ D[s+1,j] - L[s,j] > D[s+1, M[s+1,j]] \]

\[ M[s,j] = \{\, L[s,k] - f_{\tau_j}(\tau_1, \tau_2, \ldots, \tau_j, \tau_{k+1}) > 0 \,\} \]

\[ L[s,j] < D[s+1,j] - D[s+1, M[s+1,j]] \tag{1} \]

Equation (1) is computed incrementally by all the nodes in turn and the new values of L[s,j] and
M[s,j] are computed. This fixes the job sequence. Each job in the sequence keeps the information
of L[s,j] and M[s,j]. We elaborate the idea through the following example.


Example:  Consider  the  same  ringlet  as  shown  in  the  previous  example.  Consider  the  following 
job  arrivals. 


Job    S    T    C     D     P
τ1     1    3    16    80    80
τ2     2    3    80    420   420
τ3     2    1    16    95    95
τ4     3    1    80    510   510
τ5     1    2    16    90    90
τ6     3    1    16    72    72
τ7     1    3    80    390   390
τ8     2    3    80    440   440
τ9     3    2    16    77    77
τ10    2    1    16    69    69

The  following  steps  show  the  computation  of  the  job  sequence  at  each  node. 


τ1 [16,1,3,80]
  Source node: Node 1: M[1,1]=0; L[1,1]=64;
    Current scheduling:
      NULL;
  Intermediate node: Node 2: M[2,1]=0; L[2,1]=64;
    Current scheduling:
      NULL;

τ2 [80,2,3,420]
  Source node: Node 2: M[2,2]=1; L[2,2]=340;
    Current scheduling:
      τ1: M[2,1]=0; L[2,1]=64; C[2,1]=16; f(τ1,τ2)=157.37

τ3 [16,2,1,95]
  Source node: Node 2: M[2,3]=1; L[2,3]=79;
    Current scheduling:
      τ1: M[2,1]=0; L[2,1]=64; C[2,1]=16; f(τ1,τ2)=157.37
      τ2: M[2,2]=1; L[2,2]=340; C[2,2]=80; f(τ1,τ2,τ3)=253.39

      τ1: M[2,1]=0; L[2,1]=64; C[2,1]=16; f(τ1,τ3)=50.63
      τ3: M[2,3]=1; L[2,3]=79; C[2,3]=16; f(τ1,τ3,τ2)=253.39
      τ2: M[2,2]=2; L[2,2]=340;
  Intermediate node: Node 3: M[3,3]=0; L[3,3]=79;
    Current scheduling:
      NULL;

τ4 [80,3,1,510]
  Source node: Node 3: M[3,4]=1; L[3,4]=430;
    Current scheduling:
      τ3: M[3,3]=0; L[3,3]=79; f(τ3,τ4)=142

τ5 [16,1,2,90]
  Source node: Node 1: M[1,5]=1; L[1,5]=74;
    Current scheduling:
      τ1: M[1,1]=0; L[1,1]=64; C[1,1]=16; f(τ1,τ5)=51.44

τ6 [16,3,1,72]
  Source node: Node 3: M[3,6]=1; L[3,6]=56; M[3,4]=2; L[3,4]=430;
    Current scheduling:
      τ4: M[3,4]=1; L[3,4]=430; f(τ3,τ4,τ6)=247.3
      τ6 needs to be scheduled before τ4;

      τ3: M[3,3]=0; L[3,3]=79; f(τ3,τ6)=53
      τ6: M[3,6]=1; L[3,6]=56; f(τ3,τ6,τ4)=247.3

τ7 [80,1,3,390]
  Source node: Node 1: M[1,7]=2; L[1,7]=310;
    Current scheduling:
      τ1: M[1,1]=0; L[1,1]=64; C[1,1]=16; f(τ1,τ5)=51.44
      τ5: M[1,5]=1; L[1,5]=74; C[1,5]=16; f(τ1,τ5,τ7)=268.58;
  Intermediate node: Node 2: M[2,7]=?; L[2,7]=310;
    Current scheduling:
      τ1: M[2,1]=0; L[2,1]=64; C[2,1]=16; f(τ1,τ3)=50.63
      τ3: M[2,3]=1; L[2,3]=79; C[2,3]=16; f(τ1,τ3,τ2)=253.39
      τ2: M[2,2]=2; L[2,2]=340; C[2,2]=80; f(τ1,τ3,τ2,τ7)=813.56

      τ1: M[2,1]=0; L[2,1]=64; C[2,1]=16; f(τ1,τ3)=50.63
      τ3: M[2,3]=1; L[2,3]=79; C[2,3]=16; f(τ1,τ3,τ7)=262.91
      τ7: M[2,7]=2; L[2,7]=310; C[2,7]=80; f(τ1,τ3,τ7,τ2)=813.56;
      => τ7 Rejected; Delete τ7.

τ8 [80,2,3,440]
  Source node: Node 2: M[2,8]=?; L[2,8]=360;
    Current scheduling:
      τ1: M[2,1]=0; L[2,1]=64; C[2,1]=16; f(τ1,τ3)=50.63
      τ3: M[2,3]=1; L[2,3]=79; C[2,3]=16; f(τ1,τ3,τ2)=253.39
      τ2: M[2,2]=2; L[2,2]=340; C[2,2]=80; f(τ1,τ3,τ2,τ8)=738.46
      => τ8 Rejected; Delete τ8.

τ9 [16,3,2,77]
  Source node: Node 3: M[3,9]=?; L[3,9]=61;
    Current scheduling:
      τ3: M[3,3]=0; L[3,3]=79; f(τ3,τ6)=53
      τ6: M[3,6]=1; L[3,6]=56; f(τ3,τ6,τ9)=129.73
      => Reject τ9;

τ10 [16,2,1,69]
  Source node: Node 2: M[2,10]=?; L[2,10]=53;
    Current scheduling:
      τ1: M[2,1]=0; L[2,1]=64; C[2,1]=16; f(τ1,τ3)=50.63
      τ3: M[2,3]=1; L[2,3]=79; C[2,3]=16; f(τ1,τ3,τ10)=80
      => Reject τ10.

Therefore  the  accepted  job  sequences  are: 

Node 1                        Node 2                        Node 3
τ1: {M=0, L=64, C=16}         τ1: {M=0, L=64, C=16}         τ3: {M=0, L=79, C=16}
τ5: {M=1, L=74, C=16}         τ3: {M=1, L=79, C=16}         τ6: {M=1, L=56, C=16}
τ7: {M=2, L=310, C=80}        τ2: {M=2, L=340, C=80}        τ4: {M=2, L=430, C=80}
Utilization = 0.583           Utilization = 0.559           Utilization = 0.548
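
The numbers in this example can be cross-checked with a short script. The sketch below rests on three assumptions that are read off the example rather than stated in the text: every L[s,j] equals D - C, a node's utilization is the sum of C/P over its final sequence, and a candidate order is kept only if the f value of each prefix does not exceed L of the last job in that prefix. The f values are copied from the example (their closed form is not given here), the Node 1 labels are inferred from the listed M/L/C values, and all identifiers are hypothetical.

```python
# Jobs from the example table: name -> (source, target, C, D, P)
JOBS = {
    "tau1": (1, 3, 16, 80, 80),
    "tau2": (2, 3, 80, 420, 420),
    "tau3": (2, 1, 16, 95, 95),
    "tau4": (3, 1, 80, 510, 510),
    "tau5": (1, 2, 16, 90, 90),
    "tau6": (3, 1, 16, 72, 72),
    "tau7": (1, 3, 80, 390, 390),
    "tau8": (2, 3, 80, 440, 440),
    "tau9": (3, 2, 16, 77, 77),
    "tau10": (2, 1, 16, 69, 69),
}

# Final sequences per node (Node 1 labels inferred from the M/L/C values).
ACCEPTED = {
    1: ["tau1", "tau5", "tau7"],
    2: ["tau1", "tau3", "tau2"],
    3: ["tau3", "tau6", "tau4"],
}

# f values printed for Node 2 when tau7 arrives, keyed by job prefix.
F_NODE2 = {
    ("tau1", "tau3"): 50.63,
    ("tau1", "tau3", "tau2"): 253.39,
    ("tau1", "tau3", "tau2", "tau7"): 813.56,
    ("tau1", "tau3", "tau7"): 262.91,
    ("tau1", "tau3", "tau7", "tau2"): 813.56,
}


def latest_start(name):
    """L = D - C, the pattern followed by every L[s,j] in the example."""
    _, _, c, d, _ = JOBS[name]
    return d - c


def node_utilization(sequence):
    """Sum of C/P over the jobs in a node's sequence."""
    return sum(JOBS[n][2] / JOBS[n][4] for n in sequence)


def order_is_feasible(order, f_values):
    """Assumed test: f(prefix) must not exceed L of the prefix's last job."""
    for end in range(2, len(order) + 1):
        prefix = tuple(order[:end])
        if f_values[prefix] > latest_start(prefix[-1]):
            return False
    return True


if __name__ == "__main__":
    for node, seq in ACCEPTED.items():
        print(node, round(node_utilization(seq), 3))
    print(order_is_feasible(["tau1", "tau3", "tau2", "tau7"], F_NODE2))
    print(order_is_feasible(["tau1", "tau3", "tau7", "tau2"], F_NODE2))
```

Under these assumptions both candidate positions for τ7 at Node 2 fail the test, which agrees with the rejection shown in the example.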

4  Application  of  Popular  Real-time  Schemes 

In a distributed real-time system, routing a set of messages through the network so that each
message is delivered on time is an important problem. Several real-time scheduling algorithms
have been proposed [3] to deal with it. Although rate monotonic scheduling (RMS)
[10] is a seminal work in real-time scheduling, it does not scale well in a distributed system. Several
heuristic algorithms have been proposed in the literature [8, 9]. In this section, we describe some of
the popular ones. Each of them can solve some restricted cases of the problem. These algorithms
are studied as alternative SCI/RT schemes and are used as references for comparison with our job
packing algorithm.

4.1 Earliest Available First Algorithm

RMS is not the only priority scheduling theory. Depending on how the priority is defined, different
scheduling algorithms can be devised. The Earliest Available First algorithm (EAF) defines the priority
level according to a message's ready time. The earlier the message is ready, the higher the priority
it gets. This is the simplest scheduling algorithm, similar to FCFS. The algorithm works
as follows: Whenever a node is ready for transmission, it sends the earliest available message. Ties
are broken arbitrarily. A set of messages with identical origin nodes, destination nodes, release
times, and deadlines is feasible with respect to non-preemptive transmission if and only if it is
feasible under EAF [8, 9]. If there is no restriction on the origin nodes, destination nodes, or release times,
EAF cannot guarantee that an accepted message will meet its deadline.

4.2  Earliest  Deadline  First  Algorithm 

The Earliest Deadline First algorithm (EDF) assigns a job's priority according to its deadline. The
intuition behind EDF is to send the message that has the earliest deadline first in order to avoid
missing its deadline. The algorithm is as follows: Whenever a node is ready for transmission, it sends
the message with the minimum deadline among all the messages present at that time. Ties can be
broken arbitrarily. It has also been shown [8, 9] that EDF is the optimal algorithm if the messages
have identical origin nodes, destination nodes, and release times.

4.3  Smallest  Slack  Time  First  Algorithm 

Slack time of a real-time job is defined as the laxity of the job, i.e., in our terminology, P - C.
The Smallest Slack Time First algorithm (SSF) treats the job with the smallest slack time as the most
critical job, and assigns it the highest priority. The underlying assumption is that the bigger the
slack time, the more tolerant the job is of delay. The basic algorithm is as follows: Whenever a node is ready
for transmission, it sends the message with the smallest slack time among all the messages present
at that time. Ties can be broken arbitrarily. As shown in [9], the SSF algorithm is optimal for a
set of messages with identical origin nodes, and is optimal for a set of messages with identical
release times.

4.4  Farthest  Away  First  Algorithm 

Farthest  Away  First  (FAF)  algorithm  considers  the  distance  from  the  source  to  the  destination. 
A  job  that  is  far  away  from  its  destination  is  assigned  higher  priority  since  it  needs  more  time  to 
pass  through  the  longer  path.  The  FAF  algorithm  works  as  follows:  Whenever  a  node  is  ready  for 
transmission,  it  sends  the  message  with  the  farthest  destination.  Ties  can  be  broken  arbitrarily.  In 
[8]  it  has  been  proven  that  FAF  is  optimal  for  a  set  of  messages  with  identical  deadlines. 
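
The four disciplines of this section differ only in the key used to rank the messages waiting at a node. The following is a minimal sketch of that idea, assuming a hypothetical `Message` record with a ready time, deadline, period P, transmission time C and a hop count to the destination; it is an illustration of the selection rules described above, not the simulator's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    name: str
    ready: int     # time at which the message becomes available
    deadline: int  # deadline of the message
    period: int    # P
    cost: int      # C, transmission time
    hops: int      # distance from the current node to the destination


# Smaller key means higher priority; min() breaks ties by list order,
# which stands in for "ties are broken arbitrarily".
PRIORITY_KEYS = {
    "EAF": lambda m: m.ready,             # earliest available first
    "EDF": lambda m: m.deadline,          # earliest deadline first
    "SSF": lambda m: m.period - m.cost,   # smallest slack (P - C) first
    "FAF": lambda m: -m.hops,             # farthest destination first
}


def pick_next(pending, policy):
    """Return the message a node would transmit next under the given policy."""
    return min(pending, key=PRIORITY_KEYS[policy])


# Hypothetical messages waiting at one node.
PENDING = [
    Message("a", ready=0, deadline=50, period=50, cost=16, hops=1),
    Message("b", ready=5, deadline=20, period=100, cost=16, hops=2),
    Message("c", ready=9, deadline=40, period=30, cost=16, hops=3),
    Message("d", ready=12, deadline=60, period=200, cost=80, hops=4),
]

if __name__ == "__main__":
    for policy in PRIORITY_KEYS:
        print(policy, pick_next(PENDING, policy).name)
```

Keeping the policies in one table makes it easy to plug a different priority computation into the same transmission loop.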
5  Numerical  Results  and  Comparison


In  this  section  we  describe  our  experimentation  model  for  SCI/RT,  present  the  results  obtained 
and  discuss  the  results.  We  also  compare  the  relative  performance  of  various  SCI/RT  schemes. 

5.1  Simulation  Model 

In order to study the feasibility and the performance of different candidate real-time schemes for
SCI and our scheme, we have built a detailed simulation model for each of them. We started
with a baseline simulation model developed at the University of Wisconsin [15]. It is a
time-driven simulator that simulates only the base SCI packet transmission protocol (without the cache
coherence protocol) in a single SCI ring with multiple nodes. It uses simplified buffer management,
i.e., a single transmit and receive queue per SCI node. The simulator was extended by the author
during his summer stay at the Wright Laboratory as a participant in the AFOSR Summer Faculty
Research Program. The simulator developed in [11] was extended to implement different plausible
SCI/RT protocols. In the following we outline only the main features of each of the simulation
models. Refer to [11] for a detailed report on the simulator.

Job Packing Scheme: We have designed and developed a simulator for the job packing algorithm.
The simulator consists of three essential parts, namely the job admission controller,
the job sequencer, and the job scheduler. The job admission controller performs all the tests
described in section 3, and keeps the necessary data structures up-to-date. The job sequencer
distributes the information regarding the new job to all the nodes and computes the job
sequences in a pseudo-distributed (i.e., distributed in a simulated environment) manner. The
function of the job scheduler is to schedule the jobs in the simulated SCI ring. It uses part
of the idle symbols for distributing scheduling related information to all the nodes.

Candidate SCI/RT Schemes: From the candidate SCI/RT protocol suite, we have selected the
train protocol and the 2-bit/8-bit priority protocols. Since the 8-bit protocol is very similar to
the train protocol, we simulate only the 2-bit part of the priority protocol. The train protocol
requires a significant change to the baseline model in order to simulate the LocalMotive and
the TicketMaster. The 2-bit protocol is implemented from its specification.

Popular Real-time Schemes: We have built a simulator to study any priority based scheduling
discipline (refer to section 4) on the SCI/RT platform. We use this platform to study the
EAF, EDF, SSF and FAF algorithms described in section 4. Note that all the protocols in this
suite are different variations of priority based scheduling. They differ in the way the priority
per job is computed and assigned. We developed algorithm specific priority computation
procedures and plugged them into the corresponding scheme.

The simulators were tested rigorously before they were used for the performance study. We
conducted the performance study with a synthetically generated real-time job workload. The workload
generation process is described next.

5.2  Workload  Generation 

In order to evaluate the performance of different protocols and to compare the performance of the
proposed job packing scheme, we created different sets of workloads. The simulators were subjected
to each set of workloads. Each workload consists of a set of jobs. A job could be either periodic or
aperiodic. The periodic jobs are representative of sensor generated data, while the aperiodic jobs
characterize one-time operations such as interrupt processing. They are both representative of real-time
traffic [18].

Periodic Workload: Each job set contains 1000 periodic jobs, with an average job utilization of
$\rho = \sum_i C_i/P_i$. The value of $\rho$ is varied over the job sets. Abiding by the standards of SCI
packet sizes, we make the computation time of a job ($C$) equivalent to 16 (command/address),
80 (16-byte address + 64-byte data), and 272 (16-byte address + 256-byte data) time units,
where one time unit represents the time needed to transmit one symbol. In our job set, the
computation time of each job is selected randomly from 16, 80, and 272. Once the value of
$C$ is selected, we go on to choose the value of $P$. This value is chosen in such a way
that the value of $C/P$ of each job falls randomly within $[\rho - r, \rho + r]$, where $r$ is a tunable
parameter. Each job is assigned a source and a destination node randomly from all the nodes
connected in the ring.

Aperiodic  Workload:  Aperiodic  job  sets  are  created  from  periodic  job  sets  by  assuming  that  a 
job  does  not  repeat.  We  use  D  =  P  to  define  the  deadline  of  an  aperiodic  job.  Each  job  in 
an  aperiodic  job  set  is  assigned  a  randomly  selected  arrival  time. 
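
The workload generation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the generator used in the study: `rho` is treated as the target C/P of a single job, `r` as an absolute half-width with `rho - r > 0`, the source and destination are drawn so that they differ, and arrival times are drawn uniformly over a hypothetical `horizon`. All function and field names are invented for the example.

```python
import random

PACKET_TIMES = (16, 80, 272)  # symbol times for the three SCI packet sizes


def make_periodic_jobs(n_jobs, rho, r, n_nodes, seed=0):
    """Draw periodic jobs whose C/P falls uniformly within [rho - r, rho + r]."""
    rng = random.Random(seed)
    jobs = []
    for _ in range(n_jobs):
        c = rng.choice(PACKET_TIMES)          # computation (transmission) time
        u = rng.uniform(rho - r, rho + r)     # target utilization of this job
        p = c / u                             # period that yields C/P = u
        src = rng.randrange(n_nodes)
        # Assumption: a job is never addressed to its own source node.
        dst = rng.choice([k for k in range(n_nodes) if k != src])
        jobs.append({"C": c, "P": p, "src": src, "dst": dst})
    return jobs


def make_aperiodic_jobs(periodic_jobs, horizon, seed=0):
    """Turn periodic jobs into one-shot jobs with D = P and a random arrival time."""
    rng = random.Random(seed)
    return [
        dict(job, D=job["P"], arrival=rng.uniform(0, horizon))
        for job in periodic_jobs
    ]


if __name__ == "__main__":
    periodic = make_periodic_jobs(1000, rho=0.01, r=0.0025, n_nodes=8)
    aperiodic = make_aperiodic_jobs(periodic, horizon=1_000_000)
    print(periodic[0])
    print(aperiodic[0])
```

Fixing the seed keeps a job set reproducible, so that every protocol can be run against the same workload.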

5.3  Comparative  Study  of  Job  Packing  with  Train  and  2-bit  Protocols 

We conducted several sets of experiments to study and evaluate the performance of the job packing
algorithm against the train and 2-bit protocols. Since in a real-time environment we are more concerned
with a job's deadline guarantee than with fairness and forward progress, we use a different set of
performance metrics, namely job reject ratio and average node utilization, and study them as a
function of load on the ring. We express the load on the ring in terms of the cumulative job utilization of
a job set. It is defined as $\sum_i C_i/P_i$ (i.e., $\rho$) over all the jobs present in the set. Job reject ratio is
the fraction of jobs that were rejected by a particular protocol because their deadlines cannot be met.
The average node utilization is time averaged over the simulation duration. The experiments are
classified according to the workload used, and are described below.

[Figure 6 panels: Job Reject Ratio vs. Cumulative Job Utilization and Average Node Utilization vs. Cumulative Job Utilization, each for random C/P within +/- 5% and +/- 25%.]

Figure 6: Performance of Job Packing, Train, 2-Bit protocols for periodic job set (Part I).
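
The two quantities that relate load to rejections in figure 6 are simple to compute from a job set and a count of rejections. A minimal sketch follows; the function names and the sample (C, P) pairs are hypothetical, and the time-averaged node utilization is left out because it depends on the simulator's internal bookkeeping.

```python
def cumulative_job_utilization(jobs):
    """Load on the ring: sum of C_i / P_i over all (C, P) pairs in the job set."""
    return sum(c / p for c, p in jobs)


def job_reject_ratio(n_rejected, n_submitted):
    """Fraction of submitted jobs that a protocol rejected."""
    if n_submitted == 0:
        raise ValueError("job set is empty")
    return n_rejected / n_submitted


# Hypothetical (C, P) pairs in symbol times.
EXAMPLE_JOBS = [(16, 80), (80, 420), (16, 95)]

if __name__ == "__main__":
    print(cumulative_job_utilization(EXAMPLE_JOBS))
    print(job_reject_ratio(3, 12))
```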

Periodic Workload: Our first set of experiments used the periodic workload as the input to
the real-time SCI protocols, and the results are plotted in figures 6 and 7. Each pair of graphs shows
the job reject ratio and the average node utilization as a function of cumulative job utilization.
Different pairs of graphs show the simulation results with different degrees of randomness in the
workload job set (i.e., r). The following observations can be made from the figures:

•  Job reject ratio is the lowest with the job packing algorithm, and the highest for the train protocol,
with the 2-bit protocol in between. This is due to the fact that the job packing algorithm tries to
accommodate as many jobs as possible through local (very low overhead) job admission, and
global job sequencing (more overhead). It can move jobs around in the sequence so that more
new jobs can get in. This results in a low job reject ratio. The train protocol, on the other
hand, wastes a significant amount of ring resource in maintaining and circulating the train
over the ring. A job has to wait at least one round trip before it gets permission (or rejection).
This extra overhead forces the train protocol to reject more jobs. The 2-bit protocol, with its
limited priority levels, cannot accept a lot of jobs. Since its overhead is much lower than that of
the train protocol, it performs better.

•  Average node utilization is highest for the job packing algorithm, and lowest for the train protocol,
with the 2-bit protocol in between. This behavior can be explained from the job reject ratio. The more
jobs a protocol admits, the higher the utilization a node will achieve under that protocol.

•  As the randomness in the job set increases, the job packing algorithm performs even better (i.e.,
lower job reject ratio and higher node utilization). This is due to the fact that randomness
in job parameters allows the job packing algorithm to make the packing tighter. In other
words, during the job sequencing phase, there is more flexibility in moving jobs around and
this results in a higher job acceptance rate. This flexibility cannot be exploited by the train or
2-bit protocols.

[Figure 7 panels: Job Reject Ratio vs. Cumulative Job Utilization and Average Node Utilization vs. Cumulative Job Utilization.]

Figure 7: Performance of Job Packing, Train, 2-Bit protocols for periodic job set (Part II).


[Figure 8 plot: Aperiodic Job Simulation Results, Deadline Miss Ratio vs. Job Arrival Rate (avg. job utl. = 0.01); horizontal axis: Job Arrival Rate, 0.00 to 0.50.]

Figure 8: Performance of Job Packing, Train, 2-Bit protocols for aperiodic job set.


Aperiodic Workload: The next set of experiments uses the aperiodic job sets as the workload.
Note that cumulative job utilization does not make the same sense in this context as it does for
periodic jobs. Instead we use job arrival rate to define the intensity of the workload. The arrival
rate is measured in time units of symbol time* to make it independent of link bandwidth. We
assume that the jobs arrive according to a Poisson arrival process. The results obtained from the
simulation are plotted in figure 8. Deadline miss ratio is defined as the fraction of jobs that miss
their deadlines, which is the most important metric in their schedule. Observe from the figure
that at low aperiodic job arrival rate (i.e., low load) the train protocol has the lowest miss ratio,
the job packing algorithm the highest, and the 2-bit protocol in between. However, just the reverse
order can be observed at higher job arrival rate (i.e., high load). The results can again be explained
by the philosophy behind the design of each of these protocols. Since the job packing algorithm
tries to pack jobs as compactly as possible, at lower job arrival rates it does not perform very
well since there is not much to pack (as the jobs do not repeat), whereas both the train and
2-bit protocols use the lightly loaded ring to send whatever job is coming, as quickly as possible.
However, as the load increases, the overhead of the train protocol and the insufficiency of the 2-bit protocol's
priority levels become more prominent and they fail to guarantee the deadline. In this scenario,
the job packing algorithm works very well since it is able to construct the sequence more appropriately.
This load sensitivity of the job packing algorithm is a desirable feature for real-time job scheduling
over SCI.

*One symbol time is defined as the time it takes to transmit one symbol over the ring.
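
A Poisson arrival process with a given rate can be generated by accumulating exponentially distributed inter-arrival gaps, and the deadline miss ratio is a simple count over completed jobs. The sketch below illustrates both under the assumption that the rate is expressed in jobs per symbol time; the function names and sample values are hypothetical.

```python
import random


def poisson_arrivals(rate, n_jobs, seed=0):
    """Arrival times (in symbol times) of a Poisson process with the given rate."""
    rng = random.Random(seed)
    t = 0.0
    times = []
    for _ in range(n_jobs):
        t += rng.expovariate(rate)  # exponential gap with mean 1 / rate
        times.append(t)
    return times


def deadline_miss_ratio(finish_times, deadlines):
    """Fraction of jobs whose finish time is later than their deadline."""
    if not deadlines:
        raise ValueError("no jobs to evaluate")
    missed = sum(1 for f, d in zip(finish_times, deadlines) if f > d)
    return missed / len(deadlines)


if __name__ == "__main__":
    arrivals = poisson_arrivals(rate=0.1, n_jobs=5)
    print(arrivals)
    print(deadline_miss_ratio([5, 12, 30], [10, 10, 40]))
```

Expressing the rate per symbol time keeps the generated workload independent of the link bandwidth, in line with the measurement unit used above.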


5.4  Comparative  Study  of  Job  Packing  with  Popular  Real-Time  Schemes 


In this section we present our experimental results on the performance of popular real-time message
scheduling algorithms on an SCI ring. We used a similar experimental setup, with the same classes of
workload, as in the previous section. Below we describe and analyze the results obtained for
each of the categories. We also compare the results with the job packing algorithm.

[Figure 9 panels: Job Reject Ratio vs. Cumulative Job Utilization and Average Node Utilization vs. Cumulative Job Utilization, each for random C/P within +/- 5% and +/- 25%.]

Figure 9: Performance of Job Packing, EAF, EDF, SSF and FAF for periodic job set (Part I).


[Figure 10 panels: Job Reject Ratio vs. Cumulative Job Utilization and Average Node Utilization vs. Cumulative Job Utilization, each for random C/P within +/- 35% and +/- 45%.]

Figure 10: Performance of Job Packing, EAF, EDF, SSF and FAF for periodic job set (Part II).

Periodic Workload: The workloads and the performance metrics used in this set of experiments
are the same as those used before (train and 2-bit protocols). We evaluate and compare the
performance of the EAF, EDF, SSF and FAF schemes with the job packing algorithm. The results are
plotted in figures 9 and 10. General conclusions drawn from these figures are the following:

1. Simple algorithms like EAF, which is a variation of the FCFS service discipline, do not work
well in a real-time environment.

2. FAF, which depends only on the destination and ignores the deadlines of the jobs, fails to
capture the real-time requirements of the jobs.

3. Both EDF and SSF work well in a real-time environment since both of them are sensitive to
the deadline (and computation time for SSF). However, the algorithms may fail to guarantee
message deadlines at high load because they are not able to change the job priorities adaptively
with load.

4. The job packing algorithm works well in a real-time environment. Although EDF and SSF
work better than job packing at low load, the roles reverse with increase in load and degree
of randomness in the workload. The job packing algorithm can exploit the flexibility to maneuver the
job sequence and change job priorities dynamically by resequencing the jobs.

The average node utilization is a direct manifestation of the effect of job reject ratio, and can be
explained in a similar way.

[Figure 11 panels: Aperiodic Job Simulation Results, Deadline Miss Ratio vs. Job Arrival Rate, for avg. job uti. = 0.01 and avg. job uti. = 0.02.]

Figure 11: Performance of Job Packing, EAF, EDF, SSF and FAF protocols for aperiodic job set.


Aperiodic Workload: We use the same aperiodic workload for this set of experiments. The
results are plotted in figure 11. A trend similar to the one observed for the previous set of
experiments can be observed here as well. At low load job packing does not perform as well as the others. At
high load, due to the load sensitivity of the job packing scheme, it performs much better than
the rest.

5.5  Discussion 

The experiments reveal that the train protocol suffers from high maintenance overhead, whereas
the 2-bit protocol may fall short in providing sufficient priority levels. Simple protocols like EAF and
FAF do not work well in a real-time environment. EDF and SSF perform well at low load, but
their high-load performance is poorer since their job priority scheme is not load sensitive. An
algorithm that is load sensitive and is able to dynamically prioritize the real-time jobs is well suited
to the SCI/RT environment. The job packing algorithm is an ideal candidate for that. However,
the current version of the proposed algorithm has moderately
high overhead in the job sequencing phase. More work needs to be done to lower the complexity
and make it more amenable to running online.

6  Conclusion  and  Future  Research


In this document we have reported the work conducted on real-time message transmission over
the Scalable Coherent Interface. The main thrust of the work was to study the performance of different
SCI/RT candidate schemes and the suitability of some of the popular real-time message delivery
techniques applied to the SCI paradigm. The study was made through extensive simulation of all these
schemes. We observe that different schemes suffer from different limitations, and conclude that a
flexible, load sensitive scheme is well suited for SCI/RT. In this regard we have developed a new
real-time message scheduling protocol over SCI, called the job packing algorithm. We have conducted
a simulation study using real-time workloads, and have shown the superiority of the proposed scheme.

The current version of the job packing protocol, although it shows great potential, suffers from
moderately high overhead in the job sequencing phase, which hinders its online implementation. At
the time of developing this report, we had several conjectures regarding the job packing algorithm.
We have verified them through experimentation, but theoretical proofs are yet to be developed. We
plan to revise the algorithm to make it amenable to online use, specify the detailed protocol steps
and complete the theoretical study. This is our future work in this direction.

Accurate Calibration of High Temperature Superconductor (HTS)

Dielectric Resonator Measurements

Krishna Naishadham
Dayton, OH 45435

Abstract 

Dielectric resonators (DRs), formed by sandwiching a cylindrical piece of polished dielectric material
(sapphire) between two planar HTS thin-films, offer an attractive platform for microwave testing
of  HTS  materials.  They  also  find  dual  use  as  components  in  low-noise  microwave  receivers.  We 
have  measured  each  DR  as  a  two-port  system,  by  exciting  and  detecting  the  modal  fields  with  loop- 
terminated  coaxial  cables.  The  observed  Q  factor  of  the  resonator  is  a  gauge  of  the  surface  resistance 
of  the  endplates,  an  important  property  for  the  microwave  characterization  of  HTS  thin  films.  In  HTS 
DR  measurements,  because  of  the  extremely  high  Q’s  (of  the  order  of  10®)  resulting  from  very  low 
dissipation,  the  measured  parameters  are  very  sensitive  to  the  background  “noise”  contributed  by  the 
coupling  mechanism,  fixturing  case  modes,  radiation,  etc.  Therefore,  it  becomes  important  to  properly 
calibrate out all the parasitics of the fixture in order to accurately measure the unloaded Q factor of
the  resonator.  In  this  paper,  we  report  an  accurate  calibration  procedure  based  on  the  application  of 
least  squares  minimization  with  convergence  enhanced  by  the  non-linear  Marquardt  algorithm.  As  it 
is  impossible  for  the  loop-coupling  mechanism  to  employ  traditional  hardware  calibration  applicable 
to  network  analyzer  measurements,  we  have  alternatively  developed  this  software  calibration  approach 
to  effectively  filter  out  the  background  noise  and  extract  the  unloaded  Q  of  the  DR.  We  have  developed 
a  computer  program  in  LabWindows  C  to  directly  interface  with  the  network  analyzer  and  extract  the 
HTS parameters of interest using the calibration algorithm. We discuss the utilization of this method
in  the  characterization  of  HTS  DRs  with  small  area  thin-films,  at  frequencies  in  the  20-40  GHz  range, 
and at cryogenic temperatures.

1 Introduction


The discovery of high temperature superconductivity in LaBaCuO at 30 K by Bednorz and Müller
(1986) [1], and in YBaCuO (YBCO) at temperatures above 90 K by Chu and several others (1987)
[2], has had a significant impact on the design of microwave systems. Because of extremely small losses (or
high Q-factor), low noise, low power consumption, potential for circuit miniaturization, high critical
current densities, and uniform small-signal behavior over a wide temperature range, high temperature
superconductor (HTS) materials are becoming increasingly useful in the aerospace industry, where size,
weight and performance need to be optimized. Several designs of passive HTS microwave circuits, such
as ultra low-loss transmission lines, sharp-skirt microwave filters, high-gain antenna arrays, etc., have
been reported [3]. In addition, HTS materials exhibit non-linear field effects at the macroscopic level
(e.g., the Josephson tunneling effect), which have made possible a number of active devices, such as field effect
transistors (FETs) and heterojunction bipolar transistors (HBTs), operating with improved performance
over their room-temperature normal conductor counterparts [4].

Most  of  the  microwave  applications  of  HTS  materials  employ  thin-film  technology  in  contrast  to 
bulk materials. An important microwave electrical property of the HTS film is the surface resistance,
which determines the dissipation in microwave devices, and hence the Q. Dielectric resonators
are attractive as characterization tools for the determination of surface resistance of superconducting
(particularly HTS) thin films. It is possible to form a resonator using only two planar films and a
sapphire cylinder, yet high sensitivity is attainable. High-Q resonators have dual use as characterization
tools and as components in microwave systems (e.g., sharp-skirt filters and low-noise oscillators). In
the latter application, it becomes very important to minimize the losses in the DR package. In this
report, we address the challenges associated with accurate experimental characterization of (miniature)
sapphire dielectric resonators of approximately 1 cm² HTS endplate areas, a criterion which would
make the design particularly attractive for aerospace applications. The utility of such characterization
in non-destructive testing of HTS material samples is evident. Measurement of surface resistance is
very important for material and circuit optimization in microwave applications. Sapphire is an attractive
substrate dielectric material for this measurement because of its low loss tangent and moderate
dielectric constant.

We  measure  each  DR  as  a  two-port  system,  by  exciting  and  detecting  the  modal  fields  with  loop- 
terminated  coaxial  cables.  The  observed  Q  factor  of  the  resonator  is  a  gauge  of  the  surface  resistance 
of  the  endplates.  Our  objective  of  testing  superconducting  samples  adds  the  complication  of  cooling 
the  resonator  to  cryogenic  temperatures.  Because  the  resonant  frequency  shifts  substantially  as  the 
temperature  is  varied,  and  because  the  properties  of  the  test  cables  and  coupling  loops  vary  with 
temperature, maintaining accurate calibration is difficult. The fact that a loop-coupled DR cannot
be experimentally calibrated adds to the complexity of parameter extraction. The loop coupling
fixture is inherently difficult to compensate, because multiple measurements such as thru-reflect-line
(TRL) cannot be accomplished at cryogenic temperatures within reasonable accuracy. Besides, the
aggravation  of  making  these  additional  measurements  at  several  temperatures  precludes  their  utility. 
In  this  paper,  we  discuss  a  method  for  the  software  calibration  of  two-port  resonator  data,  which  is 
capable  of  compensating  for  the  background  noise  resulting  from  attenuation,  multiple  reflections  and 
dispersion  introduced  by  cables  and  discontinuities  leading  to  the  resonator.  The  proposed  method 
of  DR  measurement  and  calibration  is  more  accurate  than  the  insertion  loss  measurements  typically 
reported  in  previous  investigations  [5]  -  [8],  because  both  magnitude  and  phase  of  the  four  two-port 
S-parameters,  measured  at  several  frequencies  using  a  vector  network  analyzer,  are  employed  to  fit 
Q-circles  [9]  to  the  measured  data. 

The  research  reported  in  this  paper  has  been  performed  collaboratively  with  WL/MLPO  as  part 
of  the  AFOSR  SREP  project.  The  HTS  thin  films  have  been  grown  over  sapphire  substrates  by  pulsed 
laser deposition (PLD). The microwave measurements have been performed on cryogenically cooled
HTS DRs over several frequency bands in the 20-40 GHz range, using a vector network analyzer.
The details of the microwave measurement procedure are covered in [10]-[12], and are not dealt with herein.
The reported research met the following two objectives set forth in the pertaining proposal: (a) to
develop a robust and accurate curve-fitting procedure (referred to henceforth as software calibration) for
extracting the surface resistance from inherently noisy measurements of S-parameters of the films, and
(b) to develop a convenient graphical user interface (GUI) based on LabWindows, which facilitates
automatic processing of measured data sets spanning several frequencies and temperatures. The GUI
is still being refined to accommodate more fitting functions, and make the software a user-friendly,
comprehensive  package  for  calibrating  any  two-port  resonator  measurements  (not  just  DR)  applicable 
to  the  characterization  of  HTS  materials.  It  is  anticipated  that  this  research  will  result  in  improved 
dielectric  resonator  measurements  by  providing  a  better  understanding  of  the  minimization  of  parasitic 
noise  sources  in  the  measurement  process. 

The report is organized as follows. The next section presents background information on our
preliminary research on the analysis of DR measurements. The fundamental technical approach followed
in  the  project  is  based  on  electromagnetic  analysis  of  dielectric  resonators,  and  will  be  detailed  in 
Sec.  3.  The  first  subsection  will  provide  the  experimental  details,  and  the  second  dwells  upon  the 
electromagnetic  field  analysis  of  the  resonator.  Sec.  4  presents  the  equivalent  circuit  modeling  of 
measured  data  and  introduces  the  concept  of  Q-circles.  Some  earlier  approaches  to  the  extraction  of 
parameters  from  DR  measurements,  and  their  limitations,  are  also  discussed.  Sec.  5  describes  the 
least  squares  Marquardt  (LSM)  curve-fitting  procedure,  followed  by  a  discussion  of  sample  results 
derived  from  implementation  of  the  LSM  algorithm,  in  Sec.  6.  Routine  programmatic  details  on  the 
computer implementation are avoided.

2 Background Research

The  software  calibration  algorithm  is  based  on  microwave  circuit  theory  for  tuned  resonators  [9],  [13], 
and  consists  of  curve-fitting  Q-circles  to  the  measured  data.  In  HTS  DR  measurements,  the  measured 
parameters  are  very  sensitive  to  the  background  “noise”  contributed  by  the  fixture  parasitics  such  as 
coupling  losses,  case  modes,  radiation,  etc.  As  part  of  the  AFOSR  1996  Summer  Research  Program 
[14],  we  have  developed  a  non-linear  curve-fitting  procedure,  the  LSM  algorithm,  to  accurately  filter 
out  the  noise  and  extract  the  unloaded  Q  of  a  DR.  It  has  been  observed  that  the  noise  manifests  out 
of  the  resonant  band  as  a  quasi-sinusoidal  envelope.  Within  the  resonant  band,  ideally,  the  measured 
data  can  be  fit  to  a  linear  fractional  transformation  corresponding  to  mapping  of  a  pure  Lorentzian 
into  Q-circles  in  the  complex  plane  [9].  In  order  to  represent  the  nearly  harmonic  noise,  we  multiply 
the  Lorentzian  expression  by  a  sum  of  complex  exponentials  pertaining  to  the  standing  wave  modes 
of  the  loop-coupling  mechanism.  These  modes  render  the  fit  non-linear.  We  have  implemented  this 
composite non-linear transformation in a computer program developed using LabWindows C (Program
SoftCal,  Version  1),  and  used  it  to  model  raw  data  from  a  DR  with  copper  end-plates  [13].  The  results 
presented  in  [13]  for  the  surface  resistance  of  thin  copper  plates  have  demonstrated  the  feasibility  of 
software  calibration  of  noise-corrupted  DR  measurements.  Because  measurements  were  not  available 
for  HTS  films  at  that  time,  the  program  has  not  been  tested  with  very  high  Q’s.  However,  our 
preliminary  analysis  shows  that  the  LSM  fit  is  quite  satisfactory  even  in  the  case  of  highly  corrupted 
data.  We  will  later  present  salient  features  of  the  LSM  algorithm  with  examples  on  curve-fitting. 
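
As an illustration of the composite model described above, the following minimal sketch multiplies an ideal Lorentzian response by a sum of complex exponentials. The parameterization of each standing-wave term as an amplitude and a delay, and the names `lorentzian`, `composite_model` and `ripples`, are assumptions made for this example; they are not taken from the SoftCal program.

```python
import cmath
import math


def lorentzian(f, f0, q_loaded, peak):
    """Ideal resonant response: a linear fractional function of frequency."""
    return peak / (1.0 + 2j * q_loaded * (f - f0) / f0)


def composite_model(f, f0, q_loaded, peak, ripples):
    """Lorentzian multiplied by a sum of complex exponentials.

    `ripples` is a list of (amplitude, delay_seconds) pairs. Each pair is a
    hypothetical standing-wave term amplitude * exp(-j*2*pi*f*delay); this
    parameterization is an assumption of the sketch.
    """
    envelope = sum(a * cmath.exp(-2j * math.pi * f * tau) for a, tau in ripples)
    return lorentzian(f, f0, q_loaded, peak) * envelope
```

With a single unit-amplitude, zero-delay term the model reduces to the pure Lorentzian; additional terms superimpose a quasi-sinusoidal ripple on the response, and their amplitudes and delays are what make the fit non-linear.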

A major drawback of the SoftCal 1 program is that it does not map measured data into perfect
circles in the complex plane, because all the data and the noise are fitted to the same model. Therefore,
if  the  data  is  very  noisy  and  forms  open  loops  instead  of  closed  circles,  the  LSM  method  cleans  out 
the  noise,  but  still  leaves  the  loops  open.  We  have  resorted  to  visual  interpolation  of  the  fitted  curve 
in  order  to  obtain  the  closed  Q-circles,  whose  geometrical  parameters  determine  the  unloaded  Q  of 
the  DR.  Interpolation  of  fitted  data  makes  the  model  dependent  on  the  nature  of  the  noise,  and  leads 
to  cumbersome  changes  in  the  program  for  different  data  sets.  In  order  to  circumvent  this  difficulty, 
we  have  now  begun  to  implement  the  LSM  algorithm  in  two  steps.  In  the  first,  the  Lorentzian  part  of 
the  data  is  windowed  out,  and  only  out-of-the-band  “noise”  is  fitted  to  a  series  of  decaying  complex 
exponentials.  If  we  assume  that  the  sinusoidal  noise  is  contributed  by  standing  wave  modes  along 
the  cables  terminated  in  coupling  loops,  this  first  step  is  essentially  equivalent  to  making  a  “thru” 
two-port  measurement  on  the  network  analyzer  without  the  device  under  test  (DR).  Second,  we  model 
the  whole  spectrum  of  measured  data,  including  the  Lorentzian,  with  the  LSM  algorithm,  in  the  same 
manner.  De-embedding  the  resonator  measurements  from  the  total  is  accomplished  by  subtracting 
the  first  fit  from  the  second  fit,  akin  to  full  two-port  calibration  on  a  network  analyzer.  Then,  we 
expect to obtain a smooth Lorentzian with nearly constant detuned reflection and transmission
coefficients. Such data will form pure Q-circles, and should give a very accurate unloaded Q. Because
of the similarity of this process to hardware calibration on a network analyzer, we term this two-step
curve-fitting procedure software calibration [13]. We are validating this modified LSM algorithm
(Program SoftCal 2) with measured data on HTS DRs, consisting of 1 cm² YBCO films on sapphire
pucks. At the time of this reporting, the validation was still in progress.

3  Electromagnetic  Approach 
3.1  Experimental  Details 

Cavities  containing  dielectric  resonators  are  very  useful  in  measuring  the  surface  resistance  of  HTS 
thin  films.  Fig.  1  shows  a  sapphire  dielectric  resonator  with  HTS  end  caps.  Sapphire  is  an  attractive 
dielectric  material  for  this  application  because  of  its  low  loss  tangent  and  moderate  dielectric  constant. 
The  resonator  can  be  either  free-standing,  or  enclosed  in  a  metallic  package  (cavity)  as  in  Fig.  1.  Energy 
is  coupled  into  and  out  of  the  cavity  through  two  coupling  loops.  By  proper  design  and  placement 
of these loops, one can ensure that only the dominant $\mathrm{TE}_{011}$ mode is excited within the resonator.
Because  the  fields  are  well-trapped  in  the  dielectric  and  within  a  small  cylindrical  region  outside  the 
sapphire, the losses in the resonator system can be minimized, with the result that extremely large
Q values can be obtained.


In  order  to  measure  the  surface  resistance  of  YBCO  HTS  films  over  a  wide  frequency  range, 
cylindrical  sapphire  pucks  (resonators)  of  different  diameters  are  employed.  The  HTS  film  is  placed 
non-destructively on either end of the sapphire puck, and the resonant frequency and loaded Q of
the DR configuration are determined from the measured insertion loss. The coupling coefficients and
unloaded Q are obtained by applying the LSM algorithm to the measured S-parameters. The coupling
loops are made using coaxial cables, and are connected to the 50 Ω test ports of the network analyzer for
automated measurement.

The fields trapped within the dielectric (Fig. 1) are oscillatory and described by Bessel functions of
the first kind, $J_n(x)$. The evanescent fields in free space decay exponentially along the radial direction,
and are specified by modified Bessel functions of the second kind, $K_n(x)$. These fields and their behavior are
analyzed in [6]. By imposing boundary conditions on the tangential fields at the dielectric interface
$\rho = a$, one obtains the transcendental equation

$$\frac{J_1(\xi_1 a)}{\xi_1 a \, J_0(\xi_1 a)} = -\frac{K_1(\xi_2 a)}{\xi_2 a \, K_0(\xi_2 a)}, \qquad (1)$$

where $\xi_1$ and $\xi_2$ are radial wavenumbers in the dielectric ($\rho < a$) and air ($\rho > a$), respectively. In order
to solve eq. (1) numerically for the resonant frequency, we provide initial guesses of the frequency and
$\xi_1 a$ and calculate

$$\xi_2 a = \pi \sqrt{(a/L)^2 - (2a/\lambda)^2} \qquad (2)$$

where $\lambda$ is the operating wavelength. The resonant frequencies for different pucks were computed from
(1) using Mathcad (see Table 1). The dielectric constant of sapphire is assumed to be $\epsilon_r = 9.3$. These
resonant frequencies are in good agreement with measured values. The DR is immersed in a liquid
helium dewar and cooled to cryogenic temperatures.
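
The numerical solution of eq. (1) can be sketched as follows. This is a minimal, standard-library illustration and not the Mathcad worksheet used for Table 1. It assumes eq. (1) is the usual TE0 matching condition $J_1(u)/(u J_0(u)) = -K_1(v)/(v K_0(v))$ with $u = \xi_1 a$ and $v = \xi_2 a$, evaluates the Bessel functions from their integral representations, and uses the multiplied-out (pole-free) form of the condition so that a sign-change scan followed by bisection is safe. All function names and the dimensions in the usage note are illustrative assumptions.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum (m/s)


def simpson(func, lo, hi, n):
    """Composite Simpson rule on [lo, hi] with an even number n of panels."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(lo + i * h)
    return total * h / 3.0


def bessel_j(order, x):
    """J_n(x) for integer n from Bessel's integral representation."""
    return simpson(lambda t: math.cos(order * t - x * math.sin(t)),
                   0.0, math.pi, 200) / math.pi


def bessel_k(order, x):
    """K_n(x), x > 0, from the integral of exp(-x cosh t) cosh(n t)."""
    return simpson(lambda t: math.exp(-x * math.cosh(t)) * math.cosh(order * t),
                   0.0, 12.0, 1200)


def te01_mismatch(freq, radius, length, eps_r, p=1):
    """Pole-free form of eq. (1): zero when the fields match at rho = a."""
    k0 = 2.0 * math.pi * freq / C0
    kz = p * math.pi / length
    u = radius * math.sqrt(eps_r * k0 ** 2 - kz ** 2)   # xi_1 * a
    v = radius * math.sqrt(kz ** 2 - k0 ** 2)           # xi_2 * a
    return (v * bessel_k(0, v) * bessel_j(1, u)
            + u * bessel_j(0, u) * bessel_k(1, v))


def te011_frequency(radius, length, eps_r, scan_points=60):
    """Lowest root of eq. (1): scan for a sign change, then bisect."""
    f_min = C0 / (2.0 * length * math.sqrt(eps_r))  # below this xi_1 is imaginary
    f_max = C0 / (2.0 * length)                     # above this xi_2 is imaginary
    step = (f_max - f_min) / scan_points
    lo = f_min + 0.5 * step
    g_lo = te01_mismatch(lo, radius, length, eps_r)
    for i in range(1, scan_points):
        hi = f_min + (i + 0.5) * step
        g_hi = te01_mismatch(hi, radius, length, eps_r)
        if g_lo * g_hi < 0.0:
            for _ in range(50):
                mid = 0.5 * (lo + hi)
                g_mid = te01_mismatch(mid, radius, length, eps_r)
                if g_lo * g_mid <= 0.0:
                    hi = mid
                else:
                    lo, g_lo = mid, g_mid
            return 0.5 * (lo + hi)
        lo, g_lo = hi, g_hi
    raise ValueError("no resonance bracketed between the cutoff limits")
```

For example, `te011_frequency(1.27e-3, 3.48e-3, 9.3)` searches between the two cutoff limits (where $\xi_1$ or $\xi_2$ would become imaginary) for a hypothetical puck of radius 1.27 mm and height 3.48 mm. These dimensions are chosen only to exercise the solver and are not claimed to reproduce any entry of Table 1.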


Table 1: Computed Resonant Frequencies of Sapphire Dielectric Resonators.

Radius a    Length L    Res. Freq. (GHz)
0.09”       0.137”      37.276
0.095”      0.137”      35.888
0.1”        0.137”      34.628
0.12”       0.137”      30.607
0.15”       0.137”      26.587
0.25”       0.137”      20.364

3.2  Field  Analysis 

The electromagnetic field solutions for the cylindrical dielectric resonator can be determined in terms
of cylindrical harmonics, starting from the Helmholtz equation

$$\nabla^2 \mathbf{E} + k^2 \mathbf{E} = 0, \qquad (3)$$

with the wavenumber given by $k = \omega\sqrt{\mu\epsilon}$. The magnetic field H is related to the electric field E via
Faraday’s law. Enforcing the boundary conditions at the cylindrical interface leads to solutions
for the fields of the form [6]


$$\psi = F(\xi, \rho) \begin{Bmatrix} \cos(m\phi) \\ \sin(m\phi) \end{Bmatrix} e^{-jk_z z} \qquad (4)$$


with the wavenumber separated into radial ($\xi$) and axial ($k_z$) components according to $k^2 = \xi^2 + k_z^2$.

Inside the radius of the sapphire, F is a Bessel function, $J_m$, with the radial wavenumber given by
$\xi_1^2 = \epsilon_r k_0^2 - k_z^2$, where $k_0 = \omega\sqrt{\mu_0 \epsilon_0}$ and $\epsilon_r$ is the relative dielectric constant of sapphire. Outside the
sapphire, F is a modified Bessel function, $K_m$, exhibiting an approximately exponential decrease with
increasing $\rho$, with the radial wavenumber given by $\xi_2^2 = k_z^2 - k_0^2$. Thus, the two radial wavenumbers
are related by

$$\xi_1^2 = k_0^2 (\epsilon_r - 1) - \xi_2^2. \qquad (5)$$

If we visualize the resonator as a cylindrical waveguide terminated at a length equal to an integer
multiple of half the axial wavelength, then the length is constrained to be $L = p\pi / k_z$, where p is a positive
integer. Thus, the condition for $\xi_1$ becomes

$$\xi_1^2 = \epsilon_r k_0^2 - (p\pi / L)^2. \qquad (6)$$

Enforcing  matching  of  the  fields  at  p  =  a  (the  sapphire  radius)  using  these  wavenumber  constraints 
leads to the resonant condition (1) for the axisymmetric mode given by m = 0. This mode of interest,
designated as $\mathrm{TE}_{011}$, has no $\phi$ dependence; it spans one half-wavelength in the z direction (p = 1), with zero axial
electric field. The $\mathrm{TE}_{011}$ mode is supported by currents moving in a circular pattern in the endplates.

Knowledge  of  the  resonant  frequencies  allows  us  to  calculate  and  plot  the  radial  field  distribution 
outside  the  resonator.  We  have  calculated  the  fields  for  p  >  a  in  the  DR  of  Fig.  1  using  the  expressions 
for the $\mathrm{TE}_{011}$ mode derived in [8]. It was observed that the diameter/length aspect ratio of the DR
needs to be large to contain the stray fields. Field containment is a critical issue in our application,
because the resonator used at WL/MLPO is an open structure and the films are small in size (about 10
mm²), thus enhancing the possibility of radiation leakage. In other words, the films are not sufficiently
large to ensure a small field amplitude near their edges, and the presence of coupling loops nearby
complicates the analysis. The separable cylindrical harmonics are not adequate for analyzing such a
structure,  especially  considering  that  the  package  also  has  slots  through  which  the  field  can  radiate. 
The  exact  field  analysis  pertaining  to  the  interaction  of  the  coupling  loops  with  the  resonator,  which 
accounts  for  diffraction  at  the  edges  of  the  film  and  the  package  geometry,  is  formidable.  However, 
numerical  methods  can  be  employed  for  a  simplified  geometry  to  facilitate  such  analysis.  It  was 
beyond  the  scope  of  the  project  to  implement  such  numerical  methods.  However,  the  analysis  based 
on  cylindrical  harmonics  does  provide  some  insight  into  the  fixture  design,  and  is  discussed  next. 

The key measurables for the resonator are the resonant frequency $\omega_0$ and the quality factor Q. The
Q value attributable to losses in the endplate currents is given by $Q_c = \omega_0 W_0 / P_{0c}$, where $W_0$ is the
total energy stored in the resonator and $P_{0c}$ is the power dissipation of the currents in the endplates.
The total energy is (see [8])


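$$W_0 = \frac{\epsilon_0 \epsilon_r}{2} \int |E_{\phi 1}|^2 \, dv + \frac{\epsilon_0}{2} \int |E_{\phi 2}|^2 \, dv \qquad (7)$$

and the power dissipation due to endplate current losses is

$$P_{0c} = R_s \left[ \int |H_{\rho 1}|^2 \, dv + \int |H_{\rho 2}|^2 \, dv \right]. \qquad (8)$$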


In the equations above, the integrals with subscripted 1 fields are over the volume internal to the DR
($\rho < a$), while those with subscripted 2 fields are over the volume external to the DR ($\rho > a$). The
relationship between $Q_c$ and the average surface resistance $R_s$ of the endplates is


$$Q_c = \frac{240 \pi^2 \epsilon_r}{R_s} \left( \frac{L}{\lambda} \right)^3 \frac{1 + R}{1 + \epsilon_r R} \qquad (9)$$


where  R,  the  ratio  of  electric  energy  stored  outside  the  sapphire  DR  to  that  inside  the  sapphire,  is 
given  by 


$$R = \frac{1}{\epsilon_r} \, \frac{\int |E_{\phi 2}|^2 \, dv}{\int |E_{\phi 1}|^2 \, dv}. \qquad (10)$$


The factor R indicates the level of field confinement within the sapphire. Lower values of R imply
better field trapping inside the DR, and enhance the quality factor $Q_c$. Another useful quantity in
the design of a DR is the ratio of energy stored in the evanescent field outside a given radius (greater
than a) to the total stored energy, denoted ER($\rho$). It can be evaluated at different radii of interest.
For instance, a resonator designed to test endplates with a radius $\rho$ = 5 mm must have a sufficiently
low value of ER(5 mm). Some implications of endplate size and shape are discussed by Mourachkine
[15]. It is interesting to note that the energy ratio does not consider any package losses, such as power
dissipated in the lateral cylindrical cavity walls, leakage through slots, interaction with higher-order
cavity modes, etc. ER does account for all these parasitic effects. Ideally, one should locate the
package  lateral  wall  at  a  radius  where  ER  is  sufficiently  small  that  the  loading  of  these  parasitics  can 
be  neglected. 
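
The relation between $Q_c$ and $R_s$ can be inverted to estimate the surface resistance from a measured conductor Q. The sketch below assumes that eq. (9) has the form $Q_c = (240\pi^2 \epsilon_r / R_s)(L/\lambda)^3 (1 + R)/(1 + \epsilon_r R)$, i.e. the fundamental (p = 1) case, with L and $\lambda$ in the same units; the function and parameter names are hypothetical.

```python
import math


def conductor_q(r_s, length, wavelength, eps_r, energy_ratio):
    """Q_c from the assumed form of eq. (9); energy_ratio is R of eq. (10)."""
    geometry = 240.0 * math.pi ** 2 * eps_r * (length / wavelength) ** 3
    confinement = (1.0 + energy_ratio) / (1.0 + eps_r * energy_ratio)
    return geometry * confinement / r_s


def surface_resistance(q_c, length, wavelength, eps_r, energy_ratio):
    """Invert the same relation for the average endplate R_s (ohms)."""
    return conductor_q(1.0, length, wavelength, eps_r, energy_ratio) / q_c
```

Under this assumed form, a smaller energy ratio R gives a larger confinement factor and hence a larger $Q_c$ for the same $R_s$, in line with the remark that lower values of R enhance the quality factor.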

In order to test 1 cm² area films, it is necessary to miniaturize the sapphire cylinder accordingly, to
sizes  that  imply  resonant  frequencies  in  the  range  of  20-40  GHz.  The  minimum  spacing  L  is  limited 
by  the  requirement  of  inserting  coupling  loops  between  the  endplates,  from  the  sides,  with  satisfactory 
clearance.  For  a  given  resonator  height  L,  the  radius  a  can  be  chosen  for  optimum  field  confinement  as 
gauged  by  the  radial  wavenumbers  and  the  two  energy  ratios  embodied  in  R  and  ER.  The  example  of 
L  =  2.4mm  is  considered  in  Fig.  2,  where  the  critical  parameters  of  the  DR  are  displayed.  The  degree 
of  field  confinement  varies  as  the  cylinder  geometry  (choice  of  L  and  a)  is  changed.  While  R  decreases 
with  increasing  radius  (for  a  given  L),  it  is  found  that  the  energy  stored  in  the  evanescent  tail  outside 
the sapphire has an optimum around a = 2 mm, and increases with the endplate radius beyond the
optimum.  The  longer  the  tail,  the  shorter  is  the  energy  ratio,  ER,  and  hence,  more  desirable.  The 
optimum  ER  appears  to  favor  miniature  squat  resonators. 


Figure 2: Variation of the resonant frequency (in GHz), radial wavenumbers $\xi_1$ and $\xi_2$, ratio of energy
outside-to-inside the sapphire, R, and energy ratio, ER, at 5 mm and 10 mm radii, as functions of
sapphire radius a (in mm), for cylinder height L = 2.4 mm.

4 Parameter Extraction


Two methods have been in use to extract the unloaded Q-factor of the dielectric resonator, namely, the
Ginzton method [5] and Kobayashi’s method [7]. Both of these methods are applicable to the processing
of S-parameters measured by the microwave network analyzer, and will be briefly discussed next. The
limitations of these two methods will also be presented.


4.1  Ginzton  Method 

For a resonant cavity, the magnitude of the insertion loss ($S_{21}$ expressed in dB) follows the peaked
behavior shown in Fig. 3. Ginzton’s method [5] entails the observation of the resonant frequency, $f_1$,
and $\Delta f$, the spread between half-power (3 dB) points, to determine the loaded Q-factor, $Q_L$, as

$$Q_L = \frac{f_1}{\Delta f}. \qquad (11)$$


Figure  3:  Resonant  curve  measurement  of  the  loaded  Q-factor  of  a  dielectric  resonator. 

Ginzton’s  method  employs  measured  data  from  only  two  frequencies,  and  thus,  suffers  from  the 
following limitations. First, the unloaded Q-factor cannot be calculated because the magnitude
response lacks information on the coupling coefficients, which determine the proportion of source power
actually  coupled  into  the  resonator  at  each  port.  Second,  if  the  data  is  either  asymmetric  around 
the  peak  or  corrupted  by  measurement  noise,  an  extraction  procedure  based  only  on  magnitudes  may 
yield  very  unreliable  results.  The  phase  of  the  measured  S-parameters  becomes  important  in  these 
situations. 
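
The half-power reading of eq. (11) can be sketched as follows for a sampled sweep. This is a minimal illustration that assumes a single clean peak in the linear magnitude of $S_{21}$; the function name and the linear interpolation between neighboring samples are choices made for this example.

```python
import math


def loaded_q_3db(freqs, s21_mag):
    """Estimate (f1, Q_L) from |S21| samples given as linear magnitudes."""
    peak_idx = max(range(len(s21_mag)), key=lambda i: s21_mag[i])
    half_power = s21_mag[peak_idx] / math.sqrt(2.0)

    def crossing(indices):
        # Walk away from the peak until the response drops to the 3 dB level,
        # then interpolate linearly between the two bracketing samples.
        prev = peak_idx
        for i in indices:
            if s21_mag[i] <= half_power:
                frac = (s21_mag[prev] - half_power) / (s21_mag[prev] - s21_mag[i])
                return freqs[prev] + frac * (freqs[i] - freqs[prev])
            prev = i
        raise ValueError("sweep does not reach the half-power level")

    f_low = crossing(range(peak_idx - 1, -1, -1))
    f_high = crossing(range(peak_idx + 1, len(freqs)))
    f1 = freqs[peak_idx]
    return f1, f1 / (f_high - f_low)
```

Because only the peak and the two crossings enter the estimate, a single noisy sample near any of these three points shifts $Q_L$ directly, which is the sensitivity discussed in the text.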


4.2  Kobayashi  Method 

Kobayashi’s method [7] also employs an HTS dielectric resonator operating in the $\mathrm{TE}_{011}$ mode to
determine the surface resistance of HTS films. The extraction procedure in Kobayashi’s method
improves upon the Ginzton method by providing the coupling coefficients, from which the unloaded
Q-factor may be determined. Essentially, the loaded Q-factor is still computed from the resonant
peak and the two 3 dB points, as in Ginzton’s method (see (11)). Kobayashi, however, assumes that
the input and output coupling coefficients are equal, and determines the unloaded Q-factor from the
insertion loss, T, at the resonant frequency $f_1$ (see Fig. 3):

$$Q_0 = \frac{Q_L}{1 - T}, \qquad T = |S_{21}| = |S_{12}| = \frac{2\kappa_c}{1 + 2\kappa_c}, \qquad (12)$$

where $\kappa_c$ is the coupling coefficient at either port. We have found that Kobayashi’s method requires
moderate  coupling  for  accurate  prediction  of  the  unloaded  Q.  It  is  difficult  to  ensure  that  the  loops 
are  always  correctly  positioned  for  equal  coupling,  especially  with  the  small  resonator  fixtures  that  we 
employ  at  higher  frequencies.  Since  Kobayashi’s  method  is  also  based  on  magnitude  measurements,  it 
suffers  from  the  same  limitations  as  Ginzton’s  method,  namely  that  the  Q  can  be  adversely  affected 
by  a  few  errant  or  inconsistent  points  in  the  sweep. 
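
Assuming eq. (12) reads $Q_0 = Q_L/(1 - T)$ with $T = |S_{21}| = 2\kappa_c/(1 + 2\kappa_c)$, the equal-coupling extraction can be sketched as below. The function name is hypothetical, and the insertion loss is taken as a positive number of dB at resonance.

```python
def kobayashi_unloaded_q(q_loaded, insertion_loss_db):
    """Return (Q_0, kappa_c) under the equal-coupling assumption of eq. (12)."""
    t = 10.0 ** (-abs(insertion_loss_db) / 20.0)  # linear |S21| at resonance
    if not 0.0 < t < 1.0:
        raise ValueError("insertion loss must be greater than 0 dB")
    kappa_c = t / (2.0 * (1.0 - t))               # solves T = 2k / (1 + 2k)
    return q_loaded / (1.0 - t), kappa_c
```

For a weakly coupled resonator (large insertion loss) T is small, so $Q_0$ stays close to $Q_L$; as the insertion loss approaches 0 dB the correction factor grows without bound, which makes the result sensitive to errors in the measured magnitude.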


4.3  Equivalent  Circuit  Modeling 

A  resonator,  in  principle,  has  many  modes  with  different  resonant  frequencies.  However,  if  attention 
is  focused  on  the  dominant  mode,  which  is  the  only  one  typically  excited,  the  dielectric  resonator  can 
be  conveniently  represented  by  a  parallel  tuned  circuit  [9].  Thus,  microwave  circuit  theory  can  be 
employed to formulate a robust extraction algorithm for the determination of the unloaded Q-factor.
Unlike  the  existing  methods  of  determining  unloaded  Q  from  dielectric  resonator  measurements  [8], 
such  an  algorithm  would  utilize  both  magnitude  and  phase  of  the  measured  S-parameters. 

An  equivalent  circuit  of  the  dielectric  resonator  configuration  of  Fig.  1,  including  the  coupling 
loops, is shown in Fig. 4. The resonator is completely specified by the resonant frequency $\omega_0$, the
unloaded quality factor $Q_0$, and the conductance $G_0$ (or the resistance $R_0$). The input and output
coupling loops are each modeled by a series resistance $R_s$ and reactance $X_s$. The series resistance
accounts for the power dissipated in the coupling loop. The series reactance includes the reactance
of the loop, and encompasses the influence of all higher-order evanescent modes with distant resonant
frequencies. This influence is usually negligible. Therefore, the equivalent circuit is valid only near
the first (fundamental) resonance. The analyzer is connected to the loops by means of two lossy
transmission lines with characteristic impedance $R_c$. For modeling purposes, these lines are assumed
to have lengths $\ell_1$ and $\ell_2$, respectively, at the input and output ends. In practice, these lengths
cannot  be  determined  with  any  reasonable  certainty,  and  hence,  it  becomes  important  to  estimate  the 
attenuation  and  phase  shift  for  a  given  cable. 

Figure 4: Equivalent circuit of the dielectric resonator with loop coupling.

The unloaded admittance of the resonator is calculated as

$$Y_0 = \frac{1}{R_0} \left[ 1 + jQ_0 \, 2 \, \frac{\omega - \omega_0}{\omega_0} \right], \qquad (13)$$

where $\omega$ is the operating frequency and $\omega_0$ is the unloaded resonant frequency. This complex admittance
does not consider the external loading of the coupling loops and the connecting transmission
lines.  The  complex  loaded  admittance  of  the  resonator  is  given  by 


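$$\kappa_k = \kappa_k^c + \kappa_k^l, \quad k = 1, 2, \qquad (16)$$

$$\kappa_k^l = \frac{R_c R_0}{(R_c + R_{sk})^2 + X_{sk}^2}, \qquad \kappa_k^c = \frac{R_{sk} R_0}{(R_c + R_{sk})^2 + X_{sk}^2}.$$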

Notice that superscript l denotes coupling associated with the transmission line, whereas superscript
c denotes that caused by the loop. Physically, each coupling coefficient equals the ratio of power
dissipated in the external component (transmission line or loop) to power dissipated in the resonator.
Using standard circuit theory [9], the input impedance at each port can be calculated as

$$Z_{ek} = \frac{1}{Y_{ek}} = R_c + R_{sk} + jX_{sk}, \quad k = 1, 2.$$


The  port  reflection  coefficients  are  then  given  by 


Skk  =  Vk^  Tdk  + - ^ - ^ - 

l  +  Ki+K2l+jQL2^- 


with  the  unloaded  and  loaded  Q  factors  related  by 


$$Q_0 = Q_L (1 + \kappa_1 + \kappa_2). \qquad (21)$$

In the limit as the resonator is detuned to an extremum on either side of the resonance, it is evident from (20)
that the reflection coefficient approaches a value $\Gamma_{dk}$ given by

$$\Gamma_{dk} = \frac{R_{sk} + jX_{sk} - R_c}{R_{sk} + jX_{sk} + R_c}.$$
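
The port quantities of the equivalent circuit can be evaluated numerically as in the sketch below. It assumes that the line and loop coupling coefficients take the forms $\kappa^l = R_c R_0 / [(R_c + R_s)^2 + X_s^2]$ and $\kappa^c = R_s R_0 / [(R_c + R_s)^2 + X_s^2]$, which follow from the power-ratio interpretation given in the text; the function name and the element values used with it are illustrative.

```python
def port_quantities(r_c, r_s, x_s, r_0):
    """Return (kappa_line, kappa_loop, gamma_d) for one coupling port.

    r_c: line characteristic impedance, r_s + j*x_s: loop series impedance,
    r_0: resonator resistance at resonance (all in ohms).
    """
    z_ext = complex(r_c + r_s, x_s)           # external impedance Z_ek
    denom = abs(z_ext) ** 2
    kappa_line = r_c * r_0 / denom            # power lost in the line termination
    kappa_loop = r_s * r_0 / denom            # power lost in the loop resistance
    z_loop = complex(r_s, x_s)
    gamma_d = (z_loop - r_c) / (z_loop + r_c)  # detuned reflection coefficient
    return kappa_line, kappa_loop, gamma_d
```

A quick consistency check is that the two coefficients add up to $R_0$ times the real part of $1/Z_{ek}$, the total conductance that the port presents to the resonator, and that a lossless, reactance-free loop gives a detuned reflection coefficient of -1.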


The  transmission  coefficient  also  can  be  derived  by  appealing  to  circuit  theory,  and  is  given  by 


‘5'21  =  ‘S'l2  = 


2Jk'4 


1  +  Kl  +  K2  1  +  JQl^- 


The phase angles $\gamma_k$ and $\phi$ are functions of the loop parameters $R_{sk}$ and $X_{sk}$, and are given by

$$\gamma_k = 2 \arctan \frac{X_{sk}}{R_c + R_{sk}},$$

$$\phi = \arctan \frac{X_{s1}}{R_c + R_{s1}} + \arctan \frac{X_{s2}}{R_c + R_{s2}}.$$

4.4 Q Circles


As the frequency deviates from $\omega_0$, the reflection and transmission coefficients describe circles in the
complex plane, known as Q circles [9]. The unloaded Q-factor can be accurately computed from the
center and diameter of the Q-circle. As an example, Fig. 5 shows the Q circle for $S_{11}$, plotted from
(20), for the case:


^  =  2,  Qo-  1000, 

ILc 

—no  ^  ^  -^sl  _  c  '^*2  ,  c 

-  0-4,  =  0.5,  —  =  1.5, 

■lXjq 

l3oii  =  36  deg.,  /?o^2  =  40  deg.,  /o  =  1  GHz, 


where $\beta_0$ is the free space wavenumber at $f_0$. The circle is obtained by plotting over a frequency
band of $\pm f_0 (3/Q_0)$ around the unloaded resonant frequency. For lossy coupling loops, the energy
coupled into the resonator is reduced by the dissipation in the loops, with the result that the Q-circle
is tangential to a circle, known as the coupling loss circle, at the detuned point. The loss circle is
shown by the dashed curve in Fig. 5. Similar Q-circles and loss circles can be drawn for $S_{22}$ and
$S_{12}$. The reader is referred to the Appendix for further details on extraction of the unloaded Q from the
geometrical attributes of these circles.


Figure  5:  Q-circle  for  the  simulated  reflection  coefficient  data. 

In summary, the unloaded Q-factor can be accurately computed from the center and diameter of
the Q-circle, which are related to the coupling coefficients and hence to the circuit element values. These
circles can be drawn readily from the assumed element values for simulated resonators. The Q-circles
are usually not smooth for measured data because of extraneous noise and other limitations of the
measurement system. These imperfections cannot be calibrated out with hardware. However, because
physical reasoning requires that the measured S-parameters of loop-coupled resonators trace out
Q-circles as a function of frequency [9], a better model than linear least squares, one which takes into
account the transmission line loss and the phase shift at discontinuities, may be used to filter out the noise.
This modified algorithm, discussed next, enhances the convergence of the least squares
iterations using the Marquardt method.
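The circle property can be checked numerically with a short sketch. It assumes a generic single-resonance response of the form $w(t) = w_d + A/(1 + jQ_L t)$, where $w_d$ is the detuned value, $A$ is a complex constant, and $t$ is a normalized frequency. This form and all numerical values below are illustrative assumptions, not the exact expressions (20) or (23).

```python
import cmath

def resonant_response(t, w_detuned, amplitude, q_loaded):
    """Detuned value plus a single resonant term (assumed generic form)."""
    return w_detuned + amplitude / (1.0 + 1j * q_loaded * t)

# Hypothetical values, chosen only for illustration.
w_detuned = cmath.rect(0.9, 2.5)     # response far from resonance
amplitude = cmath.rect(1.2, -0.4)    # complex strength of the resonant term
q_loaded = 300.0

# Normalized frequency t swept over +/- 10/Q_L around resonance.
ts = [(k - 100) / 100.0 * 10.0 / q_loaded for k in range(201)]
points = [resonant_response(t, w_detuned, amplitude, q_loaded) for t in ts]

# The image of the real t axis is a circle through w_detuned (|t| -> infinity)
# and w_detuned + amplitude (t = 0); these two points span a diameter.
center = w_detuned + amplitude / 2.0
diameter = abs(amplitude)
max_deviation = max(abs(abs(p - center) - diameter / 2.0) for p in points)

print("center:", center)
print("diameter:", diameter)
print("max deviation from circle:", max_deviation)
```

Under these assumptions every sampled point lies on the circle to within rounding error, which is why the center and diameter of a fitted circle carry the information needed to recover the coupling.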

5  Least  Squares  Marquardt  Curve  Fitting  Procedure 

The  LSM  method  for  parameter  extraction  of  the  two-port  loop-coupled  DR  is  based  on  a  similar 
method  for  the  analysis  of  one-port  measurements  [9]  and  multi-mode  resonators  [16].  Our  improved 
technique  involves  enhancing  the  expressions  for  the  reflection  and  transmission  coefficients,  given  by 
eqs.  (20)  and  (23),  respectively,  by  considering  the  complete  equivalent  circuit  including  the  coupling 
mechanism,  and  using  least  squares  minimization  to  fit  the  full  sweep  through  the  resonance.  The  fit 
functions  to  these  coefficients  are  of  the  form 


Wi  = 


ajti  -1-  02 
1  -j-  azti 


i=i 


(26) 


with the normalized frequency

$$t_i = 2\,\frac{f_i - f_L}{f_0}. \tag{27}$$

Eq. (26) may be viewed as a non-linear fractional transformation mapping the normalized frequency
variable, $t_i$, to the space spanned by $w_i$. The complex transformation constants $a_1$, $a_2$, $a_3$, the
amplitudes $A_j$ and the propagation constants $\gamma_j$ of the $p$ transmission line modes existing on the
connecting cables because of the impedance discontinuity at the loop interface, are to be determined from
the set of $N$ measurements, $(f_i, w_i)$, $i = 1, 2, \ldots, N$, where $w_i$ denotes the theoretical approximation to the
measured parameter at the frequency $f_i$. The functional dependence of these constants on the physical
parameters of the resonator may be determined by comparing the right hand side of (20) or (23) with
that of (26). For example, from (23) we obtain


O]  —  0,  02 


1  +  Kl  -h  K2 


03  =  JQl- 


(28) 


The terms in the summation in (26) account for cable losses, multiple reflections, and relative phase
shifts (dispersion) introduced by spurious discontinuities in the cables leading to the resonator. Normally,
these factors are calibrated out using multiple sets of independent measurements. One such
measurement is the transmission between the two ports without the resonator. However, this measurement
is not reliable for a loop-coupled resonator, because the loops are designed to couple only weakly to
each other. Uncalibrated or poorly calibrated data are better analyzed by using (26) as a fit function.
The inclusion of transmission line modes renders the fit non-linear.

Since the measured data are overdetermined, the problem of calculating the transformation constants
in the fit function may be cast as minimization of the square error between the measurements $w_{mi}$ and
the model $w_i$, defined by

$$\chi^2(\mathbf{a}) = \sum_{i=1}^{N} \frac{1}{\sigma_i^2} \left| w_{mi} - w_i \right|^2. \tag{29}$$


Here, we assume that $N > m$, where $m$ is the number of independent parameters of the fit function.
For notational convenience, these parameters of (26) are stored in an array $\mathbf{a} = [a_1\ a_2\ \cdots\ a_m]^T$,
with the superscript $T$ denoting transpose. The factor $1/\sigma_i^2$ denotes the weighting constant for the $i$th
sample, and may be assumed to be unity without any loss of generality. When the parameters are allowed
to vary from their initial estimates by differential increments, $\delta a_j$, the model $w_i$ can be approximated
by a first-order Taylor series expansion as

$$w_i \approx w_{i0} + \sum_{j=1}^{m} \frac{\partial w_{i0}}{\partial a_j}\, \delta a_j, \tag{30}$$


where $w_{i0}$ and the derivative are evaluated at the initial guess $\mathbf{a} = \mathbf{a}_0$. Although the model is non-linear
with respect to the parameter vector $\mathbf{a}$, the Taylor series approximation in (30) effectively
linearizes the function, so that linear least squares theory can be applied.

Expressing the data, $w_{mi}$, and the function, $w_i$, in complex form as $w_{mi} = r_{mi} + jx_{mi}$ and $w_i = r_i + jx_i$, respectively, the chi-squared error in (29) may be written as

$$\chi^2 = \sum_{i=1}^{N} \frac{1}{\sigma_i^2} \left[ (r_{mi} - r_i)^2 + (x_{mi} - x_i)^2 \right]. \tag{31}$$


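We minimize $\chi^2$ with respect to each of the parameter increments, $\delta a_k$, by setting the parametric
derivatives equal to zero:

$$\frac{\partial \chi^2}{\partial(\delta a_k)} = -2 \sum_{i=1}^{N} \frac{1}{\sigma_i^2} \left[ (r_{mi} - r_i) \frac{\partial r_i}{\partial(\delta a_k)} + (x_{mi} - x_i) \frac{\partial x_i}{\partial(\delta a_k)} \right] = 0. \tag{32}$$

Noting from (30) that

$$\frac{\partial w_i}{\partial(\delta a_k)} = \frac{\partial w_{i0}}{\partial a_k}, \tag{33}$$

and replacing $w_i$ with its Taylor series expansion, we obtain from (32)

$$\sum_{i=1}^{N} \frac{1}{\sigma_i^2} \left[ (r_{mi} - r_{i0}) \frac{\partial r_{i0}}{\partial a_k} + (x_{mi} - x_{i0}) \frac{\partial x_{i0}}{\partial a_k} \right] = \sum_{j=1}^{m} \sum_{i=1}^{N} \frac{1}{\sigma_i^2} \left[ \frac{\partial r_{i0}}{\partial a_j} \frac{\partial r_{i0}}{\partial a_k} + \frac{\partial x_{i0}}{\partial a_j} \frac{\partial x_{i0}}{\partial a_k} \right] \delta a_j, \qquad k = 1, 2, \ldots, m. \tag{34}$$

Eq. (34) can be concisely expressed in matrix form as $[C][\alpha] = [\beta]$, with the elements given by

$$C_{kj} = \sum_{i=1}^{N} \frac{1}{\sigma_i^2} \left[ \frac{\partial r_{i0}}{\partial a_j} \frac{\partial r_{i0}}{\partial a_k} + \frac{\partial x_{i0}}{\partial a_j} \frac{\partial x_{i0}}{\partial a_k} \right], \tag{35}$$

$$\beta_k = \sum_{i=1}^{N} \frac{1}{\sigma_i^2} \left[ (r_{mi} - r_{i0}) \frac{\partial r_{i0}}{\partial a_k} + (x_{mi} - x_{i0}) \frac{\partial x_{i0}}{\partial a_k} \right], \qquad \alpha_j = \delta a_j, \qquad j, k = 1, 2, \ldots, m. \tag{36}$$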

The matrix $[C]$ and the vector $[\beta]$ can be completely calculated from the initial guess $\mathbf{a}_0$. The derivatives
can be evaluated analytically from the model (26). This procedure can be iterated, with the
corrective offset of the parameter vector at each iteration computed using $[\alpha] = [C]^{-1}[\beta]$. If the
initial guess is close to the solution vector, then this linearized least squares implementation suffices.
However, for the noise-corrupted DR measurements, we found that this procedure does not converge
well because of its sensitivity to the initial guess and to deviations from an ideal Q circle. Therefore, the
correction $[\alpha]$ to the parameter vector at each iteration is implemented using the Marquardt algorithm [17], which
is formulated to seek a global minimum in the parameter space from a relatively crude initial guess.
Marquardt’s algorithm employs an interpolating parameter, $\lambda$, to influence the direction of search at
each iteration. The LSM algorithm is implemented as follows:


1. Compute $\chi^2(\mathbf{a}_0)$ given the initial guess.

2. Modify the diagonal elements of $[C]$ as $C'_{kk} = C_{kk}(1 + \lambda)$, with an initial value of $\lambda = 0.001$.

3. Compute the parametric correction, $\delta\mathbf{a}$, from $[\alpha] = [C']^{-1}[\beta]$.

4. If $\chi^2(\mathbf{a} + \delta\mathbf{a}) > \chi^2(\mathbf{a})$, then set $\lambda_{new} = 10\lambda_{old}$; else, set $\lambda_{new} = 0.1\lambda_{old}$. Recompute $[C]$ and the
new correction, $\delta\mathbf{a}$.

5. Repeat the previous step until the iterations converge, as indicated by the weighted variance
changing by less than 0.01 from one iteration to the next.
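The steps above can be sketched in a few dozen lines. The sketch below is a minimal illustration under stated assumptions, not the program used in this work: it fits only the fractional part of (26), $w = (a_1 t + a_2)/(1 + a_3 t)$, without the transmission line terms; it treats the three complex constants as six real parameters; it uses unit weights; and it stops on a tight absolute change in $\chi^2$ rather than the 0.01 criterion of step 5. All function names and test values are hypothetical.

```python
import cmath

def model(a, t):
    a1, a2, a3 = a
    return (a1 * t + a2) / (1.0 + a3 * t)

def partials(a, t):
    """Complex derivatives of the model with respect to a1, a2, a3."""
    a1, a2, a3 = a
    den = 1.0 + a3 * t
    return (t / den, 1.0 / den, -t * (a1 * t + a2) / den ** 2)

def chi2(a, ts, wm):
    return sum(abs(w - model(a, t)) ** 2 for t, w in zip(ts, wm))

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def lsm_fit(ts, wm, a0, lam=1e-3, tol=1e-14, max_iter=200):
    a = list(a0)
    cost = chi2(a, ts, wm)                      # step 1
    for _ in range(max_iter):
        C = [[0.0] * 6 for _ in range(6)]
        beta = [0.0] * 6
        for t, w in zip(ts, wm):
            res = w - model(a, t)
            g = []
            for d in partials(a, t):
                # d/dRe(a_k) is d, d/dIm(a_k) is j*d (the model is analytic in a_k).
                g.extend((d, 1j * d))
            for j in range(6):
                beta[j] += res.real * g[j].real + res.imag * g[j].imag
                for k in range(6):
                    C[j][k] += g[j].real * g[k].real + g[j].imag * g[k].imag
        for k in range(6):
            C[k][k] *= 1.0 + lam                # step 2
        dp = solve(C, beta)                     # step 3
        trial = [a[k] + complex(dp[2 * k], dp[2 * k + 1]) for k in range(3)]
        trial_cost = chi2(trial, ts, wm)
        if not trial_cost < cost:               # step 4: reject, raise lambda
            lam *= 10.0
            continue
        improvement = cost - trial_cost         # step 4: accept, lower lambda
        a, cost, lam = trial, trial_cost, lam * 0.1
        if improvement < tol:                   # step 5
            break
    return a, cost

# Noise-free synthetic transmission-type data with hypothetical parameters.
a_true = [0j, cmath.rect(0.5, 0.3), 300j]
ts = [(k - 100) / 100.0 * 0.02 for k in range(201)]
wm = [model(a_true, t) for t in ts]
a_fit, chi2_fit = lsm_fit(ts, wm, [0j, 0.4 + 0j, 250j])
print("fitted constants:", a_fit)
print("final chi-squared:", chi2_fit)
```

Scaling the diagonal by $(1 + \lambda)$, rather than adding a constant, keeps the damping meaningful when the parameters differ by orders of magnitude, as $a_2$ and $a_3$ do here.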


5.1  Initial  Guess  Estimate 


Two critical parameters which provide the initial guess to the LSM algorithm are the loaded resonant
frequency, $f_L$, and the loaded quality factor, $Q_L$. The estimation scheme for $f_L$ is based on the fact
that the magnitude of either $S_{12}$ or $S_{21}$, when plotted against frequency, exhibits maximum slope in
the neighborhood of the resonant frequency. For the measured data, it has been found that these two
parameters have resonant frequencies which are slightly shifted. Therefore, the arithmetic average of
these two parameters is examined for maximum slope of magnitude against frequency. Specifically, the
derivative $|dw_i/df|$ is calculated numerically using the central difference approximation on neighboring
frequency points (except at the end points, where either a forward or a backward difference is employed),
and the values are arranged in descending order to detect the resonant frequency, $f_L$. The detected
value is confirmed by plotting the derivative against frequency. The unloaded resonant frequency $f_0$
is approximated as $f_L$ in calculating $t_i$ as per (27). The loaded Q-factor may be estimated from the
raw data using [18]


$$Q_L = \frac{\omega_L \left| dw_i/d\omega \right|}{2\,\mathrm{Re}(w_i)}, \tag{37}$$

where $w_i$ represents the measured parameter. The form in (37) is convenient because the loaded
resonant frequency and the derivative estimated in the previous step can be used to evaluate (37) at
several frequencies in the resonant band. An average of all these closely spaced values is taken as
the best estimate of $Q_L$.


As an example of how these estimates of $f_L$ and $Q_L$ can be used in the LSM algorithm, consider
the fit function in (26) with only one exponential, whose amplitude is normalized to unity. Then, one
needs initial guesses for $a_1$, $a_2$, $a_3$ and $\gamma_1$. Clearly, $a_2$ can be set to the value of the function, $w_i$, at
$f_i = f_L$. From either (20) or (23), it follows that $a_3 = jQ_L$. We have found that the convergence of
the algorithm is not sensitive to the estimate of $\gamma_1$. Therefore, we start with an estimate of $\gamma_1 = 0$.
For the transmission parameters, the model implies $a_1 = 0$ for all iterations (see (28)), while it is set
equal to $a_3$ initially for the reflection parameters.
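The estimation of $f_L$ and $Q_L$ described in this section can be sketched as follows. This is a simplified illustration: it evaluates the slope-based estimate only at the detected resonance instead of averaging over several frequencies, it uses a synthetic transmission response whose peak value is real and positive so that $\mathrm{Re}(w)$ is well defined there, and all names and values are hypothetical.

```python
def central_derivative(f, w):
    """dw/df by central differences, with one-sided differences at the end points."""
    n = len(f)
    d = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        d.append((w[hi] - w[lo]) / (f[hi] - f[lo]))
    return d

def initial_guess(f, w):
    """Return (f_L, Q_L): resonance at the largest |dw/df|, Q from the slope there."""
    slope = [abs(x) for x in central_derivative(f, w)]
    i_res = max(range(len(f)), key=lambda i: slope[i])
    f_l = f[i_res]
    # With omega = 2*pi*f the factors of 2*pi cancel in omega_L * |dw/d(omega)|,
    # so the frequency in Hz can be used directly.
    q_l = f_l * slope[i_res] / (2.0 * w[i_res].real)
    return f_l, q_l

# Synthetic response: peak value 0.5, Q_L = 300, resonance at 1 GHz (all hypothetical).
f0, q_true = 1.0e9, 300.0
f = [f0 * (1.0 + (k - 200) * 5.0e-5) for k in range(401)]
w = [0.5 / (1.0 + 1j * q_true * 2.0 * (fi - f0) / f0) for fi in f]

f_l, q_l = initial_guess(f, w)
print("estimated f_L:", f_l)
print("estimated Q_L:", q_l)
```

The estimate of $Q_L$ falls slightly below the true value because the central difference underestimates the peak slope on a finite grid; the shortfall shrinks as the frequency spacing is refined, and it is small enough for a starting guess.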
6 Examples of Curve Fitting


We have determined the unloaded Q of the same resonator as specified in Sec. 4.4, with the raw
data for curve-fitting obtained by shifting the reference planes in eqs. (20) and (23) in accordance
with a known length of the input and output lines. Curve-fitting of the simulated data provides an
intuitive validation of the computer program that implements the LSM algorithm. In order to make the
validation over a wide band, simulated data lying within a range of $\pm 10 (f_0/Q_0)$, and corrupted by
random Gaussian noise, are input to the LSM program. The lines are assumed to be lossless, and one
exponential is used for the standing wave mode on each line. Three iterations were used to correct the
parameters. From a knowledge of the geometrical attributes of the fitted Q-circle, we have calculated
$\kappa_1 = 1.39012$, $\kappa_2 = 0.65508$, and $Q_L = 326.579$, which yield an unloaded Q of 994.5 (about 0.55% below
the specified $Q_0 = 1000$). The phase shifts to compensate for the line lengths are estimated to be
$\beta_0 \ell_1 = 36.012$ deg., $\beta_0 \ell_2 = 40.013$ deg., yielding a fitting error of $\chi^2 = 0.001$.
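The quoted unloaded Q follows directly from (21). A short check of the arithmetic, using only the fitted values stated in this section (the function names are illustrative):

```python
def unloaded_q(q_loaded, kappa1, kappa2):
    """Eq. (21): Q_0 = Q_L (1 + kappa_1 + kappa_2)."""
    return q_loaded * (1.0 + kappa1 + kappa2)

def loaded_q(q_unloaded, kappa1, kappa2):
    """Eq. (21) solved for Q_L."""
    return q_unloaded / (1.0 + kappa1 + kappa2)

# Simulated resonator: fitted values quoted in the text.
q0_sim = unloaded_q(326.579, 1.39012, 0.65508)
error_pct = 100.0 * abs(q0_sim - 1000.0) / 1000.0

# Measured resonator: loaded Q implied by the reported Q_0 and coupling coefficients.
ql_meas = loaded_q(3710.0, 0.532, 0.219)

print("simulated Q_0:", round(q0_sim, 1))
print("deviation from specified Q_0 (%):", round(error_pct, 2))
print("implied loaded Q of measured resonator:", round(ql_meas, 1))
```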


Figure 6: Comparison of the Least Squares Marquardt (LSM) curve-fit result with measured data.
The fit functions for S11 and S12 are (16) and (17), respectively.

An example of the improved results obtained by using the LSM method on measured data is shown
in Fig. 6, for a dielectric resonator formed using copper endplates with a 12 mm radius. The resonant
frequency is 26.45 GHz. Using the LSM algorithm, the calculated $Q_0$ is 3710, with coupling coefficients
$\kappa_1 = 0.532$ and $\kappa_2 = 0.219$, and $\chi^2 = 0.02$, as evidenced by the good match in both magnitude and
phase (Fig. 6). Kobayashi’s method gives $Q_0 = 4336$ for the same dataset, while Shen’s method gives
$Q_0 = 4206$. The discrepancies are not surprising, given that the raw measurements do not trace out
perfect circles in a polar representation.
7 Conclusions


An  efficient  algorithm,  based  on  least  squares  minimization  of  the  square  error  between  an  assumed 
fractional  non-linear  transformation  and  the  measured  data,  has  been  developed  for  the  extraction 
of unloaded Q from dielectric resonator measurements. The convergence of the algorithm has been
enhanced  using  the  Marquardt  method.  The  circuit  equivalent  of  the  primary  DR  mode  has  been 
employed  to  develop  idealized  expressions  for  the  resonator  response,  which  trace  out  circles  in  the 
complex  plane.  Starting  with  the  approach  of  matching  Q-circles  to  the  resonator  data,  we  have 
augmented  these  expressions  to  compensate  for  the  undesirable  influence  of  the  coupling  feed  structure. 
The  resulting  improved  method  for  analyzing  two-port  resonators  is  useful  when  calibration  is  difficult. 
This  procedure  is  more  reliable  and  accurate  than  previous  methods  based  on  three-point  resonant 
curve measurements (e.g., Ginzton’s and Kobayashi’s methods), because both magnitude and phase of
a  wide  data  sweep  around  the  resonant  frequency  are  used  to  fit  the  measured  data.  Poorly  calibrated 
data  with  a  few  errant  points  in  the  sweep  can  be  analyzed  using  the  LSM  method.  The  extraction 
program  has  been  applied  to  compute  the  unloaded  Q  of  dielectric  resonators  consisting  of  copper  end 
plates.  A  simulated  example  with  additive  random  noise  has  been  presented  to  validate  the  algorithm 
against  a  specified  DR. 


Appendix 

Determination  of  Coupling  Coefficients 


With reference to Fig. 5, let $d_{11}$ and $d_{22}$ denote the diameters of the Q-circles for the input and output
reflection coefficients, respectively, while those of the corresponding coupling loss circles are denoted by
$d_{1c}$ and $d_{2c}$, respectively. The diameter of the transmission Q-circle is $d_{12}$. These diameters are
obtained from the corresponding transformation vector $\mathbf{a}$ generated by the fitted curves, following the
procedure discussed in [9]. The diameter of the loss circle is computed as


,  4fc[l  -  iFdfcP] 

dkk-{dkk/2y-\rdk\^  +  \Tck\^' 


k  =  1,2 


(38) 


where $\Gamma_{ck}$ denotes the center of the corresponding reflection Q-circle. The various coupling coefficients
are calculated as (see eqs. (16) and (17))


K 


l 

1 


1  —  dll 


dii/2 _ 

d-}-{di2ldiifd-^} 


(39) 

(40) 
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(41) 


kI  =  ni 


k  =  1,2. 


The  unloaded  Q  factor  follows  from  these  coupling  coefficients  and  the  loaded  Q,  as  per  (21). 
MODELING OF INITIATION AND PROPAGATION OF DETONATION IN ENERGETIC SOLIDS


Joseph  M.  Powers 
Associate  Professor 

Department  of  Aerospace  and  Mechanical  Engineering 
University  of  Notre  Dame 

Abstract 


Results  of  a  study  of  the  initiation  and  propagation  of  detonation  in  energetic  solids  are  reported 
on  here.  The  study  has  focused  on  two  areas: 

•  shear  band  formation  and  reaction  initiation  in  a  thin-shelled  cylinder  rotating  under  an  applied 
torque 

•  transition  to  detonation  in  granular  solid  explosives 

The  shear  band  analysis  employs  a  simple  model  to  predict  conditions  under  which  a  global  input  of 
mechanical  energy  localizes  in  space  and  time  in  a  manner  sufficient  to  initiate  significant  chemical 
reaction.  The  second  portion  of  the  study  focuses  on  characterization  of  the  transition  to  detonation 
in  granular  materials  and  on  conditions  necessary  for  different  classes  of  such  detonations. 
1. Introduction


This  report  is  concerned  with  two  distinct  approaches  to  studying  the  initiation  of  detonation 
in  solid  explosives:  reactive  shear  band  analysis  and  multiphase  mixture  theory.  Consequently,  the 
report  is  divided  to  treat  each  approach  individually.  The  reactive  shear  band  study  is  described  in 
full  detail  by  Caspar,  1996,  and  Powers,  Caspar,  and  Mason,  1997;  the  multiphase  mixture  study 
is  described  in  full  detail  by  Gonthier,  1996,  Gonthier  and  Powers,  1996,  and  Gonthier  and  Powers, 
1997.  This  report  will  summarize  some  of  the  key  findings  of  these  works.  It  is  also  noted  that 
a  third  aspect  of  the  work  proposed,  modeling  of  reactive  Taylor  impact  with  the  EPIC  code,  has 
been  delayed  for  mainly  administrative  reasons:  the  author  has  initiated  within  the  University  the 
necessary steps to become a member of the proper government-University consortium needed to
obtain  access  to  the  EPIC  code.  As  such  no  technical  results  for  the  Taylor  impact  problem  are 
reported  here. 

2.  Reactive  shear  band  formation 

2.1  Introduction 

Motivated  by  the  long  term  goal  of  developing  munitions  which  are  insensitive  to  accidental 
initiation  and  the  short  term  goal  of  understanding  shear  banding  in  reactive  materials,  this  paper 
considers  the  behavior  of  energetic  and  inert  solids  subjected  to  simple  shear  loading.  Data  from  a 
torsional  split-Hopkinson  bar  (TSHB),  built  for  this  study,  was  reduced  to  determine  shear  stress 
and  shear  strain  characteristics  of  these  materials.  These  results  were  then  used  to  calibrate  a 
constitutive  law  for  stress,  including  the  effects  of  strain  and  strain  rate  hardening  and  thermal 
softening.  A  one  dimensional  finite  difference  study  of  shear  localization  was  performed.  The  effects 
of  thermal  conductivity,  viscoplastic  heating  and  Arrhenius  kinetics  were  modeled.  Results  revealed 
shear  localization  and  reaction  initiation  in  the  explosives  simulated.  Experimental  failure  of  the  inert 
solids,  however,  occurred  at  shear  strains  significantly  lower  than  those  predicted  by  theory.  This 
has  been  attributed  to  the  presence  of  failure  mechanisms  other  than  macroscale  shear  localization, 
which were not included in the theoretical model. While the tested energetic materials did not
undergo macroscale shear localization or initiation under the conditions considered, the study
may have intrinsic value for less brittle materials which may undergo macroscale shear localization,
or even for brittle materials, which could shear localize on a microscale. Some specifics follow; a full
literature review and detailed discussion is given by Caspar, 1996.

Initiation  of  reaction  in  energetic  solids  due  to  mechanical  insult  is  an  important,  yet  poorly 
understood  mechanism.  In  a  typical  event,  a  sharp  blow  will  result  in  an  input  of  mechanical 
energy  into  the  solid  which  will  initially  manifest  itself  in  the  form  of  internally  propagating  stress 
waves.  These  waves  will  interact  with  themselves,  material  interfaces,  and  boundaries,  all  the  time 
dissipating  mechanical  energy  into  thermal  energy.  Should  the  dissipation  rate  be  sufficiently  high 
and  geometrically  concentrated,  it  may  be  possible  to  initiate  a  temperature-sensitive  exothermic 
chemical  reaction,  which  can  ultimately  lead  to  detonation  in  the  solid. 

It is clear that in order to understand this process, it is imperative to have accurate constitutive
equations. Additionally, before full scale implementation in large scale design codes, it can be
beneficial to test the constitutive equation in a much simpler code. With such a model one can quickly
and unambiguously focus on the performance of the constitutive equation in a simple computational
environment which contains the key modeling ingredients: non-linear, experimentally verified
stress-strain-strain rate relations, finite rate exothermic temperature-sensitive chemical kinetics, and
thermal diffusion.

2.2  Experimental  Apparatus 

In this study the high strain rate constitutive behavior of explosive simulants has been determined
through the use of an experimental apparatus known as the torsional split Hopkinson bar
(TSHB), constructed specifically for this study; see Figure 1. This apparatus is capable of deforming

Figure 1: Photograph of the torsional split Hopkinson bar (TSHB) used in this
research.

materials  in  simple  shear  at  shear  strain  rates  of  10^  to  10‘‘  The  TSHB  has  previously  been  used 
to  determine  material  characteristics  for  metals,  in  which  failure  often  occurs  due  to  a  mechanism 
known  as  shear  localization.  Shear  localization  is  one  of  the  least  understood  initiation  mechanisms 
in  solid  explosives.  If  shear  localization  were  to  occur,  it  would  be  very  likely  to  appear  in  the  TSHB 
configuration  studied  here. 

Figure  2  describes  the  mechanism  of  shear  localization.  In  Figure  2a,  a  portion  of  an  undeformed 
material  is  sketched  with  thin  lines  inscribed  on  its  surface.  When  this  material  is  sheared,  the  scribe 
lines  begin  to  slant  at  a  uniform  angle,  as  seen  in  Figure  2b,  reflecting  what  is  known  as  homogeneous 
deformation.  Increased  straining  into  the  plastic  range  results  in  material  hardening.  In  addition,  if 


Figure  2:  Schematic  of  the  shear  localization  process,  (a)  Undeformed  grid  lines, 
(b)  Homogeneous  deformation,  (c)  Shear  localization 


there  is  a  geometric  discontinuity  or  other  material  weakness,  straining  near  that  discontinuity  will 
occur at a higher strain rate, which also hardens the material. This increased local deformation,
however,  also  causes  plastic  heating  of  the  material.  If  the  straining  occurs  at  high  strain  rates 
(typically  greater  than  10^  5“^),  there  is  not  enough  time  for  the  generated  heat  to  be  conducted 
away.  The  local  increase  in  heat  results  in  thermal  softening  of  the  material.  If  this  process  dominates 
over  the  hardening  due  to  strain  and  strain  rate  effects,  the  material  strength  decreases.  As  a  result 
of  this  local  softening  of  the  material,  deformation  is  localized  into  a  thin  planar  region,  as  depicted 
by  the  scribe  line  deformation  of  Figure  2c.  This  final  process  is  known  as  shear  localization  or  shear 
banding.  Due  to  the  potential  concentration  of  thermal  energy  in  a  shear  band,  it  is  hypothesized 
that  this  could  trigger  a  reaction,  which  could  spread  through  the  material. 

2.3  Analytic  Model 

A  simple  model  for  a  thin  walled  cylindrical  incompressible  reactive  material  undergoing  simple 
torsional  shear  was  developed.  The  governing  equations  utilized  are 
Here $t$ is time, $z$ is the axial distance, $v_\theta$ is the velocity in the circumferential direction, $\tau$ is the shear
stress, $e$ is the internal energy, $q_z$ is the heat flux in the axial direction, $u_\theta$ is the displacement in the
circumferential direction, $\gamma$ is the shear strain, $\lambda$ is the reaction progress variable ($0 < \lambda < 1$), and
$T$ is the temperature. The parameters $Z$, $E$, and $R$ are, respectively, the kinetic rate constant, the
reaction activation energy, and the universal gas constant. Also $\alpha$ is a stress constant; subscripts
$A$ and $B$ refer to the unreacted and reacted material, respectively; $e_A$ and $e_B$ are the internal
energies; $m_A$ and $m_B$ are the mass fractions; $c_A$ and $c_B$ are the specific heats; and $e_A^o$ and $e_B^o$

are  the  energies  of  formation.  Equation  (1)  models  the  conservation  of  linear  momentum  in  the 
circumferential  direction.  Equation  (2)  models  the  conservation  of  energy.  Equation  (3)  is  the 
definition  of  strain.  Equation  (4)  defines  velocity  as  the  time  derivative  of  displacement.  Finally, 
Equation (5) is an Arrhenius kinetics law. Equation (6) is a constitutive law for stress, where $\nu$,
$\eta$, and $n$ are the exponents which characterize the thermal softening, the strain hardening, and the strain rate
hardening,  respectively.  These  coefficients  were  chosen  to  fit  experimental  data  found  as  part  of  this 
study.  Equation  (7)  is  Fourier’s  law  of  heat  conduction.  Equation  (8)  is  a  mixture  law.  Equations  (9) 
and  (10)  are  the  caloric  state  equations.  Lastly,  Equations  (11)  and  (12)  define  the  mass  fractions 
in  terms  of  the  reaction  progress. 

These  equations  are  supplemented  by  appropriate  initial  and  boundary  conditions  and  then  cast 
in dimensionless form. It can be formally shown that the equations are parabolic and thus
suitable  for  solution  via  a  time-marching  technique.  They  are  solved  numerically  by  first  replacing 
all  terms  involving  spatial  derivatives  with  second  order  accurate  finite  difference  approximations. 
The  resulting  system  of  equations  is  a  set  of  N  non-linear  ordinary  differential  equations  in  time, 
where  N  is  related  to  the  user  chosen  fineness  of  the  finite  difference  grid.  These  equations  are  then 
integrated  implicitly  using  the  standard  package,  LSODE,  to  generate  time  dependent  solutions  at 
each  grid  point.  The  code  has  been  verified  on  a  number  of  test  problems  with  known  exact  solutions; 
grid  refinement  studies  verify  that  the  numerical  approximations  converge  to  the  exact  solutions  at 
a  rate  roughly  proportional  to  the  square  of  the  spatial  grid  size. 
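The solution strategy described above (second order spatial differences, implicit time integration, and a grid refinement check against a known exact solution) can be illustrated on a much simpler problem. The sketch below is not the code used in this study: it solves the linear heat equation $u_t = u_{xx}$ on $0 < x < 1$ with $u = 0$ at both ends, and it substitutes a backward Euler step with a tridiagonal solve for LSODE. The time step is tied to the square of the grid size so that the time discretization error shrinks at the same rate as the spatial error.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (no pivoting)."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        m = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / m
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / m
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def max_error(n_cells, t_end=0.1):
    """Maximum nodal error at t_end for the initial condition u = sin(pi x)."""
    h = 1.0 / n_cells
    n_steps = round(t_end / h ** 2)      # time step proportional to h squared
    dt = t_end / n_steps
    r = dt / h ** 2
    x = [i * h for i in range(1, n_cells)]           # interior nodes only
    u = [math.sin(math.pi * xi) for xi in x]
    m = len(u)
    lower = [-r] * m
    diag = [1.0 + 2.0 * r] * m
    upper = [-r] * m
    for _ in range(n_steps):
        # Backward Euler: (I - dt * A) u_new = u_old, A = second difference / h^2.
        u = solve_tridiagonal(lower, diag, upper, u)
    decay = math.exp(-math.pi ** 2 * t_end)
    exact = [decay * math.sin(math.pi * xi) for xi in x]
    return max(abs(a - b) for a, b in zip(u, exact))

e_coarse = max_error(20)
e_fine = max_error(40)
order = math.log(e_coarse / e_fine, 2)
print("error with 20 cells:", e_coarse)
print("error with 40 cells:", e_fine)
print("observed order of convergence:", order)
```

Halving the grid size should reduce the error by roughly a factor of four, which is the behavior a grid refinement study of a second order scheme is meant to confirm.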

2.4  Results 

Figure  3  compares  the  experimental  and  numerical  shear  stress  and  shear  strain  characteristics 
for  an  inert  simulant  of  the  pressed  explosive  PBX.  From  this  figure,  it  is  seen  that  the  model  predicts 
the  shear  stress  and  shear  strain  characteristics  fairly  accurately  until  just  before  failure.  The  code, 
however,  does  not  predict  localization  to  begin  until  a  nominal  shear  strain  of  4.63  is  reached,  as 
compared  with  the  experimentally  observed  brittle  failure  at  0.20  shear  strain.  So,  the  PBX  pressed 
simulant  does  not  fail  due  to  shear  localization. 
Figure 3: A comparison of the experimental and numerical results for the PBX
pressed simulant.


Figure 4 plots the theoretical predictions of the time evolution of the spatial temperature
distribution for a reactive material, PBXN-109. Including reaction proved to have


Figure  4:  Evolution  of  the  temperature  field  for  PBX  9501  with  reaction. 


little  effect  on  the  results  prior  to  initiation,  when  compared  to  a  simulation  in  which  reaction  was 
neglected.  As  was  anticipated  by  the  nonreactive  case,  reaction  in  the  reactive  test  did  occur  shortly 
following  the  onset  of  localization.  It  was  predicted  that  appreciable  reaction  did  not  commence 
until  the  reaction  temperature  was  reached,  at  which  time  reaction  quickly  initiates  in  the  localized 
hot  spot.  It  is  important,  however,  to  state  that  the  nominal  shear  strain  reached  at  initiation  is 
approximately 6.4, whereas the simulant failed experimentally at a shear strain of 0.2. While
this is indeed a weakness of the present model for the system studied, we reiterate that the theory
presented here, when applied to brittle materials on a microscale or to more ductile materials, may,
through blending simple chemistry and mechanics, have great promise in gaining understanding
of the initiation of reaction in complex materials.

3.  Transition  to  detonation  in  granular  solid  explosives 


3.1  Introduction 


Considerable  research  has  been  conducted  during  the  past  three  decades  addressing  the  evo¬ 
lution  of  detonation  in  granulated  energetic  material.  This  research  has  largely  been  motivated  by 
concerns  over  the  accidental  detonation  of  damaged  high  explosives  or  propellants  in  response  to 
weak  mechanical  shock  or  thermal  insult  (Asay  and  Hantel  1991).  Here,  damaged  material  refers 
to  cast  solid  material  which  has  been  inadvertently  fractured;  thus,  local  granulated  regions  exist 
within  the  material. 
A number of two-phase continuum models have been formulated for analyzing deflagration-to-detonation
transition (DDT) in granulated explosives (Butler and Krier 1986; Baer and Nunziato
1986;  Powers  et  al.  1990).  Numerical  simulations  based  on  these  two-phase  models  have  been 
modestly  successful  in  predicting  most  experimentally  observed  features  of  DDT  including  1)  the 
formation  and  propagation  of  a  lead  compaction  wave,  2)  the  initiation  and  subsequent  acceleration 
of  a  burn  front  in  the  compacted  material,  and  3)  the  final  transition  to  detonation.  However,  little 
emphasis  has  been  given  to  an  analysis  of  fully-developed  detonation  structure.  Moreover,  many 
DDT  simulations  are  performed  using  coarse  computational  grids  which  are  incapable  of  resolving 
fine-scale  detonation  structure.  As  such,  fully-developed  detonation  structures  predicted  by  two- 
phase  DDT  simulations  are  not  well-characterized. 

The  primary  objective  of  this  study  is  to  predict  and  analyze  two-phase  detonation  structures  by 
numerically simulating DDT whereby combustion is induced by weak, planar mechanical shock due
to  low  velocity  piston  impact  (~  100  m/s),  and  to  compare  the  predicted,  fully-resolved  structures 
with  results  given  by  a  steady-state  detonation  analysis.  A  secondary  objective  of  this  work  is  to 
classify  new  steady  detonation  structures.  To  this  end,  we  use  a  variant  of  the  model  formulated  by 
Powers  et  al.  (1990a).  The  steady  analysis  is  a  minor  extension  of  the  work  performed  by  Powers  et 
al.  (1990b).  The  unsteady  analysis  is  an  extension  of  the  work  performed  by  Gonthier  and  Powers 
(1996). 

3.2  Mathematical  model 

The  model  assumes  the  existence  of  compressible  reactive  solid  particles  and  a  compressible 
inert  gas.  Mass,  momentum,  and  energy  exchange  between  the  gas  and  solid  are  modeled,  as  is 
dynamic  compaction  of  the  granular  bed  due  to  a  mechanical  stress  imbalance.  Diffusive  transport 
mechanisms  within  each  phase  are  ignored.  Also,  the  effects  of  lateral  boundaries  on  the  two-phase 
flow  are  not  considered;  as  such,  the  flow  is  assumed  one-dimensional  (in  a  macroscopic  sense).  The 
dimensional model equations are given by the following:

In these equations, the subscripts “1” and “2” denote quantities associated with the gas and
solid, respectively. Quantities labeled with overhats (“^”) are dimensional, and quantities labeled with
subscript “o” are associated with the ambient state. The independent variables are time $\hat{t}$ and
position $\hat{x}$. Dependent variables are as follows: the phase density $\hat{\rho}_i$ ($i = 1, 2$), defined as the mass
of phase $i$ per unit volume occupied by that phase; the phase pressure $\hat{P}_i$; the phase temperature $\hat{T}_i$;
the particle velocity $\hat{u}_i$, measured with respect to a stationary reference frame; the specific internal
energy $\hat{e}_i$; the volume fraction $\phi_i$, defined as the ratio of the volume occupied by phase $i$ to the
total volume ($\phi_1 + \phi_2 = 1$); the radius of the spherical solid particles $\hat{r}$; the number of particles per
unit volume $\hat{n}$ ($= 3\phi_2/(4\pi\hat{r}^3)$); the intragranular stress $\hat{f}$; and an ignition variable $I$. In Eqs. (13-21),
$H(I - I_{ig})$ is the Heaviside unit step function, and $I_{ig}$, $\hat{a}$, $m$, $\hat{\beta}$, $\hat{h}$, $\hat{\mu}_c$, $\hat{k}_I$, and $\hat{T}_I$ are constant
parameters. Closure is achieved by specifying thermal [$\hat{P}_i = \hat{P}_i(\hat{\rho}_i, \hat{T}_i)$] and caloric [$\hat{e}_i = \hat{e}_i(\hat{\rho}_i, \hat{T}_i)$]
state relations for each phase, and by specifying the functional form for $\hat{f}$.
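
The relation between particle number density, solid volume fraction, and particle radius quoted above (n equal to 3 times the solid volume fraction divided by 4 pi r cubed) can be illustrated with a short calculation. The following is a minimal sketch only; the function names and the sample values (a solid volume fraction of 0.7 and a particle radius of 100 micrometers) are assumptions chosen for illustration and are not taken from the report.

```python
import math


def number_density(phi_solid, radius):
    """Particles per unit volume for spheres of a given radius: n = 3*phi2 / (4*pi*r^3)."""
    return 3.0 * phi_solid / (4.0 * math.pi * radius ** 3)


def radius_from_number_density(phi_solid, n):
    """Invert the same relation to recover the particle radius."""
    return (3.0 * phi_solid / (4.0 * math.pi * n)) ** (1.0 / 3.0)


# Hypothetical sample state (not from the report)
phi2 = 0.7      # solid volume fraction, dimensionless
r0 = 100.0e-6   # particle radius in meters

n0 = number_density(phi2, r0)

# If the number of particles is unchanged while half of the solid volume is
# consumed, the radius shrinks by a factor of 0.5**(1/3).
r_half = radius_from_number_density(0.5 * phi2, n0)
shrink_factor = r_half / r0
```

Under these assumed values the number density is about 1.67e11 particles per cubic meter, and halving the solid volume fraction at fixed number density reduces the radius to roughly 79 percent of its initial value.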

Equations  (1-3)  and  (4-6)  are  the  mass,  momentum,  and  energy  evolution  equations  for  the  gas 
and  solid,  respectively.  Equation  (7)  is  a  dynamic  compaction  equation,  Eq.  (8)  is  a  particle  number 
evolution  equation,  and  Eq.  (9)  is  an  evolution  equation  for  the  ignition  variable. 
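
For orientation, the roles of the nine equations described above can be written in a schematic form. This is a sketch under stated assumptions and not the author's equations: the interphase source terms $\mathcal{M}$, $\mathcal{D}$, $\mathcal{E}$, $\mathcal{F}$, and $\mathcal{G}$ are placeholder symbols introduced here, their functional forms are left unspecified, overhats are omitted, the sources are assumed equal and opposite between phases so that the mixture is conservative, and the particle number is assumed to be conserved.

```latex
\begin{align}
\partial_t(\rho_1\phi_1) + \partial_x(\rho_1\phi_1 u_1) &= \mathcal{M}, \\
\partial_t(\rho_1\phi_1 u_1) + \partial_x\!\left(\rho_1\phi_1 u_1^2 + P_1\phi_1\right) &= \mathcal{D}, \\
\partial_t\!\left[\rho_1\phi_1\!\left(e_1 + \tfrac{1}{2}u_1^2\right)\right]
  + \partial_x\!\left[\rho_1\phi_1 u_1\!\left(e_1 + \tfrac{1}{2}u_1^2 + P_1/\rho_1\right)\right] &= \mathcal{E}, \\
\partial_t(\rho_2\phi_2) + \partial_x(\rho_2\phi_2 u_2) &= -\mathcal{M}, \\
\partial_t(\rho_2\phi_2 u_2) + \partial_x\!\left(\rho_2\phi_2 u_2^2 + P_2\phi_2\right) &= -\mathcal{D}, \\
\partial_t\!\left[\rho_2\phi_2\!\left(e_2 + \tfrac{1}{2}u_2^2\right)\right]
  + \partial_x\!\left[\rho_2\phi_2 u_2\!\left(e_2 + \tfrac{1}{2}u_2^2 + P_2/\rho_2\right)\right] &= -\mathcal{E}, \\
\partial_t\phi_2 + u_2\,\partial_x\phi_2 &= \mathcal{F}, \\
\partial_t n + \partial_x(n u_2) &= 0, \\
\partial_t I + u_2\,\partial_x I &= \mathcal{G}.
\end{align}
```

In this schematic, the first three lines play the role of the gas equations, the next three the solid equations, and the last three the compaction, particle number, and ignition equations, in the order described in the text.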


3.3  Numerical  method 

The non-strictly hyperbolic system of model equations was solved using a new high-resolution
upwind  numerical  method  (Gonthier  and  Powers  1997).  The  method,  which  is  based  on  Godunov’s 
approach,  does  not  require  the  explicit  use  of  artificial  viscosity,  can  accurately  capture  shocks  with 
minimal  smearing,  and  can  accurately  resolve  disparate  time  scales  associated  with  rate-dependent 
processes. Rather than using the exact solution of the two-phase Riemann problem at each
computational cell boundary to advance the solution in time, an approximate solution is used for increased
computational efficiency. The method is convergent, and the spatial convergence rate was
determined based on comparisons of numerical predictions with known theoretical results for several test
problems.  Global  convergence  rates  of  ~  1.0  were  determined  for  problems  having  embedded  shocks, 
and  rates  of  ~  1.7  were  determined  for  problems  having  continuous  solutions. 
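
The way a spatial convergence rate is typically extracted from a grid refinement study can be sketched as follows. This is an illustration only: it uses a first-order upwind scheme for the linear advection equation with a smooth periodic solution as a stand-in, not the high-resolution two-phase method described above, and the function names, grid sizes, and CFL number are assumptions made for this example.

```python
import math


def observed_rate(err_coarse, err_fine, refinement=2.0):
    """Observed convergence rate p, assuming the error behaves like C * h**p."""
    return math.log(err_coarse / err_fine) / math.log(refinement)


def upwind_error(n_cells, t_final=0.5, cfl=0.5):
    """L1 error of first-order upwind for u_t + u_x = 0 on a periodic unit domain."""
    h = 1.0 / n_cells
    dt = cfl * h
    n_steps = round(t_final / dt)
    x = [(i + 0.5) * h for i in range(n_cells)]
    u = [math.sin(2.0 * math.pi * xi) for xi in x]
    for _ in range(n_steps):
        # For i = 0, u[i - 1] is u[-1], the last cell (periodic boundary).
        u = [u[i] - cfl * (u[i] - u[i - 1]) for i in range(n_cells)]
    t = n_steps * dt
    exact = [math.sin(2.0 * math.pi * (xi - t)) for xi in x]
    return h * sum(abs(a - b) for a, b in zip(u, exact))


# Halve the cell size twice and measure the rate between successive grids.
errors = [upwind_error(n) for n in (50, 100, 200)]
rates = [observed_rate(errors[k], errors[k + 1]) for k in range(2)]
```

For this first-order stand-in the observed rates come out close to 1. The same procedure, applied to a reference solution with or without embedded shocks, is the kind of comparison that yields rates such as those quoted above.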

3.4  Results 


The numerical simulations predicted most experimentally observed features characteristic of
piston-initiated DDT in granular HMX. Experimentally observed time scales, wave speeds, and
pressure magnitudes are correctly predicted. Several classes of steady two-phase detonation wave
structures were predicted to evolve: Chapman-Jouguet (CJ) and weak detonation structures having
a  lead  shock  in  the  gas  and  an  unshocked  solid,  CJ  structures  having  a  lead  shock  in  the  solid  and  an 
unshocked  gas,  and  CJ  structures  having  a  shock  in  both  the  gas  and  solid  (Gonthier  1996).  Which 
structure  evolved  was  found  to  depend  on  the  material  compaction  rate,  the  interphase  drag  rate, 
and  the  ambient  mixture  density.  The  results  indicate  that  the  CJ  wave  speed  is  not  the  unique 
wave  speed  for  a  self-propagating  two-phase  detonation.  Numerically  predicted  structures  agree  well 
with  results  given  by  the  strictly  steady-state  detonation  wave  analysis. 

Shown in Fig. 1 is the predicted gas velocity history (measured relative to a fixed laboratory
frame) for the evolution of a two-phase weak detonation having a lead shock in the gas and an
unshocked solid. Here, $\hat{\xi}$ is position measured relative to a coordinate system attached to the piston,
and $\hat{\tau} = \hat{t}$. Also shown in this figure is the spatial profile at $\hat{\tau}$ = 120 μs. For this simulation, a
virial equation of state was used for the gas and a Tait equation of state was used for the solid. The
moving piston (located at $\hat{\xi}$ = 0 cm) induces the formation of a compaction wave propagating at
402 m/s. Ignition is predicted to occur near the piston surface approximately 135 μs after piston
impact; a rapid transition to detonation is then predicted. The resulting detonation
propagates at 6168 m/s. A comparison of the shocked gas-unshocked solid weak detonation
structure predicted by both the numerical simulation and the steady-state analysis is given in Fig. 2.
Good agreement exists between the predicted solutions.
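
The quoted wave speeds and ignition time allow a rough kinematic estimate of when the detonation front overtakes the lead compaction wave. The sketch below is an idealization and not part of the reported analysis: it assumes constant wave speeds measured in a single frame, a compaction wave leaving the piston face at impact, and a detonation that starts at the piston face at the ignition time with no run-up distance. The function name and its arguments are hypothetical.

```python
def overtake(v_compaction, v_detonation, t_ignition, x_ignition=0.0):
    """Time and position at which the detonation front catches the compaction wave.

    Assumes constant speeds, a compaction wave leaving x = 0 at t = 0, and a
    detonation leaving x_ignition at t_ignition (idealized, no run-up).
    """
    if v_detonation <= v_compaction:
        raise ValueError("detonation must be faster than the compaction wave")
    t = (v_detonation * t_ignition - x_ignition) / (v_detonation - v_compaction)
    return t, v_compaction * t


# Values quoted in the text: 402 m/s, 6168 m/s, ignition near 135 microseconds
t_catch, x_catch = overtake(402.0, 6168.0, 135.0e-6)
```

With these inputs the estimate gives an overtaking time of roughly 144 μs at a distance of about 5.8 cm. Because the actual transition is not instantaneous, this should be read only as an order-of-magnitude check on the time and length scales of the simulation.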

4.  Acknowledgments 


The  author  gratefully  acknowledges  the  efforts  of  his  former  graduate  students  Mr.  Richard  J. 
Caspar,  currently  at  Gulfstream  Aerospace,  for  his  work  on  reactive  shear  bands,  and  Dr.  Keith  A. 
Gonthier,  currently  at  Los  Alamos  National  Laboratory,  for  his  work  on  reactive  mixture  modeling. 
The  author  also  acknowledges  the  many  conversations  with  his  colleague  at  Notre  Dame,  Prof.  James 
J.  Mason,  and  at  Eglin  AFB,  Dr.  Joseph  Foster. 

5.  References 


Asay B, and Hantel L (1991) Major thrust areas for examination of deflagration-to-detonation
transition in granular and damaged explosives. Los Alamos National Laboratory Report M-8-91-61,
Los Alamos, New Mexico

Baer  MR,  and  Nunziato  JW  (1986)  A  two-phase  mixture  theory  for  the  deflagration-to-detonation 
transition  (DDT)  in  reactive  granular  materials.  Int  J  Multiphase  Flow  12:861-889 

Butler PB, and Krier H (1986) Analysis of deflagration-to-detonation transition in high-energy
solid propellants. Combust and Flame 63:31-48

Caspar RJ (1996) Experimental and numerical study of shear localization as an initiation
mechanism in energetic solids. MS thesis. Dept of Aero and Mech Engr, University of Notre Dame

Gonthier  KA  (1996)  A  numerical  investigation  of  the  evolution  of  self-propagating  detonation  in 
energetic  granular  solids.  PhD  dissertation.  Dept  of  Aero  and  Mech  Engr,  University  of  Notre 
Dame 

Gonthier  KA,  and  Powers  JM  (1996)  A  numerical  investigation  of  transient  detonation  in  granulated 
material.  Shock  Waves  6:183-195 

Gonthier KA, and Powers JM (1997) A numerical investigation of self-propagating two-phase
detonation. Submitted to the 16th International Colloquium on the Dynamics of Explosions and
Reactive Systems, Cracow, Poland

Gonthier KA, and Powers JM (1997) A high-resolution upwind scheme for two-phase continuum
DDT models. J Comp Phys (in preparation)

Powers  JM,  Stewart  DS,  and  Krier  H  (1990a)  Theory  of  two-phase  detonation  -  part  I:  modeling. 
Combust  and  Flame  80:264-279 

Powers  JM,  Stewart  DS,  and  Krier  H  (1990b)  Theory  of  two-phase  detonation  -  part  II:  structure. 
Combust  and  Flame  80:280-303 

Powers JM, Caspar RJ, and Mason JJ (1997) Modeling and experimental investigation of reactive
shear bands in energetic solids loaded in tension. Submitted to the 16th International
Colloquium on the Dynamics of Explosions and Reactive Systems, Cracow, Poland




[Figure 1 plot: gas velocity versus position ξ (cm) and time τ (μs), including a spatial profile labeled τ = 210 μs. Labeled features: compaction wave, end of reaction zone (φ2 = 0), reflected rarefaction.]

Figure 1: Predicted gas velocity history for the shocked gas-unshocked solid weak detonation simulation.

Figure 2: Comparison of the shocked gas-unshocked solid weak detonation structure predicted by the numerical
simulation and the steady-state analysis: (a,b) gas and solid Mach number squared (relative to the wave);
(c) solid volume fraction; and (d) particle radius.